Abstract 

Numerous computer systems use dynamic control and data structures of unbounded size.
These data structures often have the character of trees, or they can be encoded as trees
with some additional pointers. This is exploited by some currently intensively studied
techniques of formal verification that represent an infinite number of states using a finite
tree automaton. However, there is currently no tree automata library implementation that
provides efficient and flexible support for such methods. The aim of this Master's
Thesis is therefore to provide such a library. The thesis first describes the theoretical
background of finite tree automata and regular tree languages. It then surveys the current
implementations of tree automata libraries and studies various verification techniques,
outlining requirements for the library. The next part proposes a representation of a finite
tree automaton and algorithms that perform standard language operations on this
representation, followed by a description of the library implementation. A series
of experiments shows that the library can compete with other available tree automata
libraries and is significantly superior to them in certain areas.
I think that I shall never see

A poem lovely as a tree. 

A tree whose hungry mouth is prest 

Against the sweet earth's flowing breast; 

A tree that looks at God all day, 

And lifts her leafy arms to pray; 

A tree that may in summer wear 

A nest of robins in her hair; 

Upon whose bosom snow has lain; 

Who intimately lives with rain. 

Poems are made by fools like me, 

But only God can make a tree. 

— Joyce Kilmer, Trees 
Chapter 1

Introduction 



Donald Knuth, the pioneer of the analysis of algorithms, says that computer scientists
love trees more than anybody else [1]. Indeed, trees play a crucial role in computer science.
They recur in many of its fields, from the representation of programs in the form of abstract
syntax trees [2], through their use for fast data retrieval in search trees [3], to tree topologies
of computer networks. It is not surprising that trees are often a natural way to represent a
model of many types of systems, including safety-critical systems.

Software errors in safety-critical systems may cause severe losses of money and, in the
worst case, even human lives (the Ariane 5 failure is perhaps the best-known case of an
expensive software failure [4]). There are several means that help to avoid software bugs
in such systems, one of them being formal verification, i.e. verification based on formal
mathematical methods.

Formal verification of computer systems has gained in popularity in recent years. One
of the reasons for this increased interest is that testing of systems, even ones that are
not very large, can never cover 100 % of cases in an acceptable time (even for systems with
finite state spaces; infinite-state systems per se cannot be completely tested). A popular
approach to formal verification is model checking (introduced in the early 1980s by E. M. Clarke,
E. A. Emerson, J. P. Queille and J. Sifakis), a method based on checking whether a given
system conforms to a given specification by systematically searching the state space of the
system. However, in the real world, there exist systems with state spaces that are infinite,
though they often have a regular structure, e.g. systems with unbounded queues or stacks.
As one of the approaches to handle infinite-state systems where states have a linear (or
effectively linearizable) structure, regular model checking has been proposed [5]. Regular
model checking is based on the following ideas: configurations of the system being verified
are represented as finite words over a finite alphabet, and transitions are represented as
relations over words. Finite (word) automata over the alphabet can then be used to represent
sets of configurations of the system, and finite (word) transducers can be used to express the
transition relation.

However, there are also systems that do not have a linear structure which would enable
a natural encoding of their configurations into finite words. A special case of these are systems
with a tree-like structure, such as parametrised tree networks or heaps. Moreover, it turns out
that many more general graph structures that cannot be easily linearized can be effectively
encoded using trees (see e.g. [6]). For such cases, it is convenient to generalise the method
to regular tree model checking [7], where finite tree automata, a generalisation of finite
automata to trees, and finite tree transducers are used.

Nonetheless, when used for reachability analysis, regular model checking in general may
suffer from problems with an infinite number of configurations, as the transducers may generate
ever new configurations. Therefore, various acceleration techniques that ensure termination of
the method for many real-world problems have been proposed. These methods may need
to perform sophisticated operations upon finite tree automata (transducers). In order to
conduct the operations in verification of non-trivial systems in an acceptable time, smart
data structures and algorithms must be used. However, there is currently no efficient tree
automata library that would be suitable for such operations (although MONA, which is
discussed in Section 3.2, includes a fairly sophisticated deterministic finite tree automata
implementation).

The aim of this work is to design an efficient library that would be suitable for sophisticated
tree model checking techniques while being flexible enough to be used even for methods
which have not yet been developed. The library focuses on an efficient representation
of finite tree automata that work with large alphabets. Unlike most other tree automata
libraries, we use a symbolic representation to encode transition functions of tree automata. An
exception in this sense is MONA, which also uses a symbolic representation. Moreover, unlike
MONA, our library can handle nondeterministic finite tree automata, which turns out
to be crucial for the efficiency of many verification approaches. We have developed a set
of algorithms that conduct standard language operations on symbolically represented
nondeterministic finite tree automata, as well as algorithms that perform several non-standard
operations, such as reduction according to downward simulation or inclusion checking based
on antichains. A prototype of the library has been implemented and evaluated through a
series of experiments.

The text is divided into several chapters. Chapter 2 introduces terms, trees, finite tree
automata and regular tree languages, while Chapter 3 discusses available libraries that
support work with tree automata. In Chapter 4, various formal verification techniques
using tree automata are studied and requirements for the library are outlined. Chapter 5
describes the proposed tree automata representation and algorithms for standard operations
that work with this representation. This is followed by a description of the implementation
of the library in Chapter 6. Chapter 7 gives experimental results. Finally, Chapter 8
summarizes the work and outlines its possible further development.
Chapter 2

Theoretical Background 



This chapter introduces standard definitions, which were taken from [8]. Theorems are
presented without proofs as they can be found in the same source. First, terms over a ranked
alphabet and trees are defined, followed by a description of tree automata and an analysis
of closure properties of regular tree languages. Then the concept of tree automata
minimisation is introduced, and decision problems for tree automata languages are discussed.
Finally, a definition of tree transducers concludes the chapter.

2.1 Terms and Trees 

This section introduces terms over a ranked alphabet and trees.

2.1.1 Terms

A ranked alphabet is a couple $(\mathcal{F}, \mathit{Arity})$ where $\mathcal{F}$ is a finite set of symbols and
$\mathit{Arity}$ is a mapping $\mathit{Arity} : \mathcal{F} \to \mathbb{N}$ ($\mathbb{N}$ denotes the set of non-negative integers).
$\mathit{Arity}(f)$, where $f \in \mathcal{F}$, is the arity of $f$. The set of symbols of arity $p$ is denoted by
$\mathcal{F}_p$. We assume that the set $\mathcal{F}_0$ (the set of constants) is nonempty. Furthermore, we use
parentheses and commas for a short declaration of symbols with arity, such as $f(,)$ for a
binary symbol $f$.

Let $\mathcal{X}$ be a set of constants called variables such that $\mathcal{X} \cap \mathcal{F}_0 = \emptyset$. We denote a set of
$n$ variables by $\mathcal{X}_n$. The set $T(\mathcal{F}, \mathcal{X})$ of terms over the ranked alphabet $\mathcal{F}$ and the set of
variables $\mathcal{X}$ is the smallest set defined by:

• $\mathcal{F}_0 \subseteq T(\mathcal{F}, \mathcal{X})$ and

• $\mathcal{X} \subseteq T(\mathcal{F}, \mathcal{X})$ and

• if $p \geq 1$, $f \in \mathcal{F}_p$ and $t_1, \dots, t_p \in T(\mathcal{F}, \mathcal{X})$, then $f(t_1, \dots, t_p) \in T(\mathcal{F}, \mathcal{X})$.

$T(\mathcal{F}, \emptyset)$ is abbreviated as $T(\mathcal{F})$. Terms in $T(\mathcal{F})$ are called ground terms. A term
$t \in T(\mathcal{F}, \mathcal{X})$ is linear if each variable occurs at most once in $t$.



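Example 1. Let $\mathcal{F} = \{a, g(), f(,)\}$ be a ranked alphabet. The ground term $t = f(f(a, a), g(a))$
can be represented graphically as a tree whose root $f$ has two children: a node $f$ with two
leaves $a$, and a node $g$ with a single leaf $a$.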
2.1.2 Trees

Let $E$ be a set of labels and $\mathcal{P}os(t) \subseteq \mathbb{N}^*$ be a prefix-closed set. A mapping
$t : \mathcal{P}os(t) \to E$ is called a finite ordered tree $t$. A term $t \in T(\mathcal{F}, \mathcal{X})$ can then be viewed
as a finite ordered ranked tree with its leaves labeled with variables or constants and its internal
nodes labeled with symbols of positive arity, with out-degree equal to the arity of the label.
A term $t \in T(\mathcal{F}, \mathcal{X})$ can then be defined as a partial function $t : \mathbb{N}^* \to \mathcal{F} \cup \mathcal{X}$ (with domain
$\mathcal{P}os(t)$) that satisfies the following properties:

(i) $\mathcal{P}os(t)$ is nonempty and prefix-closed,

(ii) $\forall p \in \mathcal{P}os(t)$, if $t(p) \in \mathcal{F}_n$, $n \geq 1$, then $\{j \mid pj \in \mathcal{P}os(t)\} = \{1, \dots, n\}$,

(iii) $\forall p \in \mathcal{P}os(t)$, if $t(p) \in \mathcal{X} \cup \mathcal{F}_0$, then $\{j \mid pj \in \mathcal{P}os(t)\} = \emptyset$.

In the following, we do not distinguish between trees and terms.
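
The view of a term as a mapping from positions to labels can be illustrated with a minimal Python sketch. The nested-tuple encoding of terms and the helper name `positions` are assumptions made only for this illustration; they are not part of the library described in this text.

```python
def positions(term, prefix=()):
    """Map every position of a term to its label.

    A term is a nested tuple (symbol, child_1, ..., child_n); a position is a
    tuple of child indices counted from 1, and () is the root position.
    """
    symbol, *children = term
    labelling = {prefix: symbol}
    for j, child in enumerate(children, start=1):
        labelling.update(positions(child, prefix + (j,)))
    return labelling

# The ground term f(f(a, a), g(a)) from Example 1.
t = ("f", ("f", ("a",), ("a",)), ("g", ("a",)))
pos = positions(t)
```

The key set of `pos` is prefix-closed, and every node labelled with a symbol of arity n has exactly the children 1, ..., n, as required by properties (i) to (iii).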

2.1.3 Substitutions 

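A substitution $\sigma$ is a mapping $\sigma : \mathcal{X} \to T(\mathcal{F}, \mathcal{X})$ (and a ground substitution is a mapping
$\sigma : \mathcal{X} \to T(\mathcal{F})$) where there are only finitely many variables which are not mapped to
themselves. The domain of a substitution $\sigma$ is the subset of variables $x \in \mathcal{X}$ such that
$\sigma(x) \neq x$. The substitution $\{x_1 \leftarrow t_1, \dots, x_n \leftarrow t_n\}$ is the identity on $\mathcal{X} \setminus \{x_1, \dots, x_n\}$
and maps $x_i \in \mathcal{X}$ to $t_i \in T(\mathcal{F}, \mathcal{X})$ for every index $1 \leq i \leq n$. The following extends
substitutions to $T(\mathcal{F}, \mathcal{X})$:

$$\forall f \in \mathcal{F}_n,\ \forall t_1, \dots, t_n \in T(\mathcal{F}, \mathcal{X}):\quad \sigma(f(t_1, \dots, t_n)) = f(\sigma(t_1), \dots, \sigma(t_n)). \qquad (2.1)$$

We do not distinguish between a substitution and its extension to $T(\mathcal{F}, \mathcal{X})$. Postfix notation is
often used for substitutions: $t\sigma$ is the result of applying $\sigma$ to the term $t$.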
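
As a rough illustration of equation (2.1), the following Python sketch applies a substitution to a term. The encoding is an assumption chosen for the example: function symbols are nested tuples, variables are plain strings, and the substitution is a dictionary.

```python
def apply_substitution(term, sigma):
    """Apply sigma (dict: variable name -> term) homomorphically to a term.

    Variables are plain strings; other terms are tuples (symbol, child, ...).
    """
    if isinstance(term, str):           # a variable
        return sigma.get(term, term)    # variables outside the domain are kept
    symbol, *children = term
    return (symbol, *(apply_substitution(c, sigma) for c in children))

# t = f(x1, g(x2)) and sigma = {x1 <- a, x2 <- f(a, a)}
t = ("f", "x1", ("g", "x2"))
sigma = {"x1": ("a",), "x2": ("f", ("a",), ("a",))}
result = apply_substitution(t, sigma)
```

Here `result` encodes the ground term f(a, g(f(a, a))).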

2.1.4 Contexts 

A linear term $C \in T(\mathcal{F}, \mathcal{X}_n)$ is called a context and the expression $C[t_1, \dots, t_n]$ for
$t_1, \dots, t_n \in T(\mathcal{F})$ denotes the term in $T(\mathcal{F})$ obtained from $C$ by replacing variable $x_i$ by $t_i$
for each $1 \leq i \leq n$, i.e. $C[t_1, \dots, t_n] = C\{x_1 \leftarrow t_1, \dots, x_n \leftarrow t_n\}$. $\mathcal{C}^n(\mathcal{F})$ denotes the set of
contexts over $(x_1, \dots, x_n)$.

Contexts with a single variable are denoted as $\mathcal{C}(\mathcal{F})$. A context is trivial if it is reduced
to a variable. Given a context $C \in \mathcal{C}(\mathcal{F})$, we denote the trivial context by $C^0$, $C^1$ is equal
to $C$ and, for $n > 1$, $C^n = C^{n-1}[C]$ is a context in $\mathcal{C}(\mathcal{F})$.
2.2 Regular Tree Languages and Finite Tree Automata

This section introduces various kinds of finite tree automata. 



2.2.1 Nondeterministic Finite Tree Automata 

A (bottom-up) nondeterministic finite tree automaton (NFTA) over $\mathcal{F}$ is a 4-tuple
$\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$, where $Q$ is a finite set of states ($Q \cap \mathcal{F} = \emptyset$), $Q_f \subseteq Q$ is a set of
final states and $\Delta$ is a set of transition rules

$$f(q_1(x_1), \dots, q_n(x_n)) \to q(f(x_1, \dots, x_n)), \qquad (2.2)$$



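where $n \in \mathbb{N}$, $f \in \mathcal{F}_n$, $q, q_1, \dots, q_n \in Q$ and $x_1, \dots, x_n \in \mathcal{X}$. The move relation
$\to_{\mathcal{A}}$ is defined by: let $t, t' \in T(\mathcal{F} \cup Q)$,

$$t \to_{\mathcal{A}} t' \iff \begin{cases}
\exists C \in \mathcal{C}(\mathcal{F} \cup Q),\ \exists u_1, \dots, u_n \in T(\mathcal{F}), \\
\exists f(q_1(x_1), \dots, q_n(x_n)) \to q(f(x_1, \dots, x_n)) \in \Delta, \\
t = C[f(q_1(u_1), \dots, q_n(u_n))], \\
t' = C[q(f(u_1, \dots, u_n))].
\end{cases} \qquad (2.3)$$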



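$\to^*_{\mathcal{A}}$ is the reflexive and transitive closure of $\to_{\mathcal{A}}$. A ground term $t \in T(\mathcal{F})$ is accepted by
an NFTA $\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ if there exists $q \in Q_f$ such that

$$t \to^*_{\mathcal{A}} q(t). \qquad (2.4)$$

The set of all ground terms accepted by an NFTA $\mathcal{A}$ (the language of $\mathcal{A}$) is denoted as $\mathcal{L}(\mathcal{A})$.
A set $\mathcal{L}$ of ground terms is regular if there exists an NFTA $\mathcal{A}$ such that $\mathcal{L} = \mathcal{L}(\mathcal{A})$. If two (or
more) NFTA accept the same tree language, they are equivalent.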



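Example 2. Consider the ground term $t = f(f(a, a), g(a))$ from Example 1. Let
$\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ be an NFTA and let $t_1 \in T(\mathcal{F} \cup Q)$, $t_1 = f(q_1(f(a, a)), q_2(g(a)))$, be the term $t$
partially processed by $\mathcal{A}$, $t \to^*_{\mathcal{A}} t_1$. Assume that $f(q_1(x_1), q_2(x_2)) \to q_1(f(x_1, x_2)) \in \Delta$; then
the following sequence of transitions is possible:

$$t \to^*_{\mathcal{A}} f(q_1(f(a, a)), q_2(g(a))) \to_{\mathcal{A}} q_1(f(f(a, a), g(a))).$$

If $q_1 \in Q_f$, then $t \in \mathcal{L}(\mathcal{A})$.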



An NFTA $\mathcal{A}$ is complete if there is at least one rule

$$f(q_1(x_1), \dots, q_n(x_n)) \to q(f(x_1, \dots, x_n)) \in \Delta \qquad (2.5)$$

for all $n \geq 0$, $f \in \mathcal{F}_n$, and $q_1, \dots, q_n \in Q$. A state $q \in Q$ is accessible if there exists a ground
term $t$ such that $t \to^*_{\mathcal{A}} q(t)$. An NFTA is reduced when all of its states are accessible.
The set of transition rules can also be defined as a set of rules of an alternative
form: $f(q_1, \dots, q_n) \to q$. A move relation can be defined as before, except that instead of
preserving the structure of the term, the NFTA $\mathcal{A}$ replaces subtrees with its states. A term
$t$ is then accepted by an NFTA $\mathcal{A}$ if

$$t \to^*_{\mathcal{A}} q \qquad (2.6)$$

where $q \in Q_f$.
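
The acceptance condition (2.6) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the representation used by the library: rules of the alternative form are stored in a dictionary keyed by the symbol and the tuple of child states, and the example automaton with states `qa`, `qg`, `qf` is hypothetical.

```python
from itertools import product

def reachable_states(term, delta):
    """Return the set of states q such that term ->* q.

    term:  nested tuple (symbol, child_1, ..., child_n)
    delta: dict mapping (symbol, (q_1, ..., q_n)) to a set of target states
    """
    symbol, *children = term
    child_sets = [reachable_states(c, delta) for c in children]
    states = set()
    # Try every combination of states the children can be rewritten to.
    for combo in product(*child_sets):
        states |= delta.get((symbol, combo), set())
    return states

def accepts(term, delta, final_states):
    """A term is accepted if it can be rewritten to some final state."""
    return bool(reachable_states(term, delta) & final_states)

# Hypothetical NFTA over {a, g(), f(,)}.
delta = {
    ("a", ()): {"qa"},
    ("g", ("qa",)): {"qg"},
    ("f", ("qa", "qa")): {"qf"},
    ("f", ("qf", "qg")): {"qf"},
}
t = ("f", ("f", ("a",), ("a",)), ("g", ("a",)))
```

With final states {qf}, the term f(f(a, a), g(a)) is accepted, whereas g(a) is not. Enumerating all combinations of child states is simple but can be expensive for highly nondeterministic automata.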

2.2.2 Nondeterministic Finite Tree Automata with e-rules 

The definition of NFTA with $\varepsilon$-rules is similar to the definition of NFTA, except for the set
of transition rules, which may now also contain $\varepsilon$-rules of the form $q \to q'$, i.e. the state is
changed without processing an input symbol.

Theorem 1. If $\mathcal{L}$ is accepted by an NFTA with $\varepsilon$-rules, then $\mathcal{L}$ is accepted by an NFTA
without $\varepsilon$-rules.

2.2.3 Deterministic Finite Tree Automata 

A deterministic finite tree automaton (DFTA) is an NFTA where there are no two rules
with the same left-hand side (and no $\varepsilon$-rules) in $\Delta$. It is unambiguous, i.e. there is at most
one run for every ground term, which means that there is at most one state $q \in Q$ such
that $t \to^*_{\mathcal{A}} q$.

Theorem 2. Let $\mathcal{L}$ be a regular set of ground terms. Then there exists a DFTA that accepts
$\mathcal{L}$.

2.3 Closure properties 

2.3.1 Union 

Theorem 3. The class of regular tree languages is closed under union. 

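Let us have the following two complete NFTAs (an NFTA can always be made complete
by adding missing transitions that all point to a non-accepting sink state):
$\mathcal{A}_1 = (Q_1, \mathcal{F}, Q_{f1}, \Delta_1)$ and $\mathcal{A}_2 = (Q_2, \mathcal{F}, Q_{f2}, \Delta_2)$. Now let us construct an NFTA
$\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ that accepts $\mathcal{L}(\mathcal{A}) = \mathcal{L}(\mathcal{A}_1) \cup \mathcal{L}(\mathcal{A}_2)$, where $Q = Q_1 \times Q_2$,
$Q_f = Q_{f1} \times Q_2 \cup Q_1 \times Q_{f2}$, and $\Delta = \Delta_1 \times \Delta_2$ where

$$\Delta_1 \times \Delta_2 = \{ f((q_1, q'_1), \dots, (q_n, q'_n)) \to (q, q') \mid
f(q_1, \dots, q_n) \to q \in \Delta_1,\ f(q'_1, \dots, q'_n) \to q' \in \Delta_2 \}. \qquad (2.7)$$

This construction preserves determinism, i.e. if $\mathcal{A}_1$ and $\mathcal{A}_2$ are deterministic, then $\mathcal{A}$ is
deterministic too.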
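
The product construction (2.7) can be tried out with the following Python sketch. The dictionary encoding of rules, the helper names and the two small automata over the alphabet {a, s()} are assumptions made for this example only.

```python
from itertools import product

def product_rules(delta1, delta2):
    """Pair up rules of two automata that read the same symbol, as in (2.7).

    Each delta maps (symbol, (q_1, ..., q_n)) to a set of target states.
    """
    delta = {}
    for (f1, qs1), targets1 in delta1.items():
        for (f2, qs2), targets2 in delta2.items():
            if f1 != f2 or len(qs1) != len(qs2):
                continue
            lhs = (f1, tuple(zip(qs1, qs2)))
            pairs = {(q1, q2) for q1 in targets1 for q2 in targets2}
            delta.setdefault(lhs, set()).update(pairs)
    return delta

def union_final_states(states1, final1, states2, final2):
    """Q_f1 x Q_2 united with Q_1 x Q_f2."""
    return {(p, q) for p in states1 for q in states2 if p in final1 or q in final2}

def accepts(term, delta, final_states):
    """Bottom-up run on a term given as (symbol, child, ...)."""
    def run(t):
        symbol, *children = t
        states = set()
        for combo in product(*(run(c) for c in children)):
            states |= delta.get((symbol, combo), set())
        return states
    return bool(run(term) & final_states)

# Hypothetical complete automata over {a, s()}:
# A1 accepts terms with an even number of s, A2 accepts exactly s(a).
delta1 = {("a", ()): {"even"}, ("s", ("even",)): {"odd"}, ("s", ("odd",)): {"even"}}
delta2 = {("a", ()): {"n0"}, ("s", ("n0",)): {"n1"},
          ("s", ("n1",)): {"dead"}, ("s", ("dead",)): {"dead"}}

delta = product_rules(delta1, delta2)
final = union_final_states({"even", "odd"}, {"even"}, {"n0", "n1", "dead"}, {"n1"})

def s_power(k):
    """Build the term s(s(...s(a)...)) with k occurrences of s."""
    term = ("a",)
    for _ in range(k):
        term = ("s", term)
    return term
```

The union automaton accepts a, s(a) and s(s(a)), but rejects s(s(s(a))). Both inputs must be complete: if one automaton got stuck on a term, the product would get stuck as well, even when the other automaton accepts.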

2.3.2 Complementation 

Theorem 4. The class of regular tree languages is closed under complementation. 

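Let $\mathcal{L}(\mathcal{A})$ be a regular tree language and $\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ be a complete DFTA. Then the
automaton $\mathcal{A}' = (Q, \mathcal{F}, Q \setminus Q_f, \Delta)$ accepts the complement of the set $\mathcal{L}(\mathcal{A})$ in $T(\mathcal{F})$.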
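2.3.3 Intersection

Theorem 5. The class of regular tree languages is closed under intersection.

Closure under intersection follows directly from closure under union and complementation
using De Morgan's law:

$$\mathcal{L}_1 \cap \mathcal{L}_2 = \overline{\overline{\mathcal{L}_1} \cup \overline{\mathcal{L}_2}} \qquad (2.8)$$

where $\overline{\mathcal{L}}$ denotes the complement of the set $\mathcal{L}$ in $T(\mathcal{F})$. A direct construction that preserves
determinism follows: let $\mathcal{A}_1 = (Q_1, \mathcal{F}, Q_{f1}, \Delta_1)$ and $\mathcal{A}_2 = (Q_2, \mathcal{F}, Q_{f2}, \Delta_2)$ be NFTA. Consider
the NFTA $\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ such that $Q = Q_1 \times Q_2$, $Q_f = Q_{f1} \times Q_{f2}$ and $\Delta = \Delta_1 \times \Delta_2$. It
holds that $\mathcal{L}(\mathcal{A}) = \mathcal{L}(\mathcal{A}_1) \cap \mathcal{L}(\mathcal{A}_2)$.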



2.4 Minimisation of Tree Automata 

An equivalence relation $\equiv$ on $T(\mathcal{F})$ is a congruence on $T(\mathcal{F})$ if for every $f \in \mathcal{F}_n$

$$u_i \equiv v_i,\ 1 \leq i \leq n \implies f(u_1, \dots, u_n) \equiv f(v_1, \dots, v_n). \qquad (2.9)$$

It is of finite index when there are only finitely many $\equiv$-classes. An equivalent definition
is that a congruence is an equivalence relation closed under context, i.e. for all contexts
$C \in \mathcal{C}(\mathcal{F})$, if $u \equiv v$, then $C[u] \equiv C[v]$. Let $\mathcal{L}$ be a tree language; then the relation $\equiv_{\mathcal{L}}$ on
$T(\mathcal{F})$ is defined by: $u \equiv_{\mathcal{L}} v$ if for all contexts $C \in \mathcal{C}(\mathcal{F})$,

$$C[u] \in \mathcal{L} \iff C[v] \in \mathcal{L}. \qquad (2.10)$$

Myhill-Nerode Theorem for Tree Languages. The following three statements are
equivalent:

(i) $\mathcal{L}$ is a regular tree language,

(ii) $\mathcal{L}$ is the union of some equivalence classes of a congruence of finite index,

(iii) the relation $\equiv_{\mathcal{L}}$ is a congruence of finite index.

An interesting part of the proof of the theorem above is the proof of (iii) $\implies$ (i):

Proof. Let $Q_{min}$ be the finite set of equivalence classes of $\equiv_{\mathcal{L}}$. Let us define the transition
relation $\Delta_{min}$ as the smallest set such that

$$f([u_1], \dots, [u_n]) \to [f(u_1, \dots, u_n)] \in \Delta_{min} \qquad (2.11)$$

for all $f \in \mathcal{F}_n$, $u_1, \dots, u_n \in T(\mathcal{F})$, where $[u]$ denotes the equivalence class of term $u$. The
definition of $\Delta_{min}$ is consistent because $\equiv_{\mathcal{L}}$ is a congruence. Let $Q_{min_f} = \{[u] \mid u \in \mathcal{L}\}$.
The DFTA $\mathcal{A}_{min} = (Q_{min}, \mathcal{F}, Q_{min_f}, \Delta_{min})$ accepts the tree language $\mathcal{L}$. □

It can be proved that $\mathcal{A}_{min}$ is minimum (in the number of states) and unique up to a
renaming of states.
2.5 Top-down Tree Automata

A nondeterministic top-down finite tree automaton (top-down NFTA) over $\mathcal{F}$ is a 4-tuple
$\mathcal{A} = (Q, \mathcal{F}, I, \Delta)$, where $Q$ is a finite set of states, $I \subseteq Q$ is a set of initial states and $\Delta$ is
a set of transition rules

$$q(f(x_1, \dots, x_n)) \to f(q_1(x_1), \dots, q_n(x_n)), \qquad (2.12)$$

where $n \geq 0$, $f \in \mathcal{F}_n$, $q, q_1, \dots, q_n \in Q$, $x_1, \dots, x_n \in \mathcal{X}$. The move relation is easily deduced
from the move relation for bottom-up NFTA.

The tree language $\mathcal{L}(\mathcal{A})$ accepted by $\mathcal{A}$ is the set of ground terms $t$ for which there is
an initial state $q \in I$ such that

$$q(t) \to^*_{\mathcal{A}} t. \qquad (2.13)$$

Note that the expressive power of bottom-up and top-down nondeterministic finite tree
automata is the same. However, top-down DFTA are strictly less powerful than top-down
NFTA.

2.6 Decision Problems and their Complexity 

This section summarises some decision problems of regular tree languages and their
complexity in the context of RAM machines. Note the increased complexity with respect to
regular word languages, which implies an even stronger need for a very careful design and
various heuristic optimizations when working with finite tree automata.

• The fixed membership problem (determining whether a certain ground term is
accepted by a fixed finite tree automaton, i.e. the automaton is not the input of the
decision procedure) is ALOGTIME-complete.

• The uniform membership problem (determining whether a certain ground term is
accepted by a given finite tree automaton, i.e. the automaton is also the input of the
decision procedure) can be decided in linear time for DFTA and in polynomial time
for NFTA.

• The emptiness problem (determining whether the language accepted by a given finite
tree automaton is empty) is decidable in linear time.

• The intersection non-emptiness problem (determining whether there is at least one
ground term accepted by each finite tree automaton from a given finite sequence of
tree automata) is EXPTIME-complete.

• The finiteness problem (determining whether the language of a given finite tree automaton
is finite) is decidable in polynomial time.

• The complement emptiness problem (determining whether a given finite tree automaton
accepts every ground term) can be decided in polynomial time for DFTA and it
is EXPTIME-complete for NFTA.

• The equivalence problem (determining whether two given finite tree automata accept
the same language) is decidable.

• The singleton set problem (determining whether a given finite tree automaton accepts
only a single ground term) is decidable in polynomial time.
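
The emptiness problem from the list above reduces to computing the accessible states: the language is empty exactly when no final state is accessible. The sketch below is a naive fixpoint iteration written for readability; it is an assumption-laden illustration and does not achieve the linear-time bound, which needs a more careful worklist algorithm. The rule encoding and the example automaton are hypothetical.

```python
def accessible_states(delta):
    """Compute all states q for which some ground term t satisfies t ->* q.

    delta maps (symbol, (q_1, ..., q_n)) to a set of target states.
    """
    reached = set()
    changed = True
    while changed:
        changed = False
        for (_symbol, child_states), targets in delta.items():
            # A rule can fire once all of its child states are accessible.
            if all(q in reached for q in child_states) and not targets <= reached:
                reached |= targets
                changed = True
    return reached

def is_empty(delta, final_states):
    """The language is empty iff no final state is accessible."""
    return not (accessible_states(delta) & final_states)

# Hypothetical automaton: qg can only be produced from the inaccessible state qb.
delta = {
    ("a", ()): {"qa"},
    ("f", ("qa", "qa")): {"qf"},
    ("g", ("qb",)): {"qg"},
}
```

With final states {qf} the language is non-empty, while with final states {qg} it is empty, because no rule ever produces `qb`.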
2.7 Tree Transducers



2.7.1 Bottom-up Tree Transducers

A nondeterministic bottom-up tree transducer (NBUTT) is a 5-tuple $U = (Q, \mathcal{F}, \mathcal{F}', Q_f, \Delta)$,
where $Q$ is a set of states ($Q \cap \mathcal{F} = \emptyset$, $Q \cap \mathcal{F}' = \emptyset$), $\mathcal{F}$ and $\mathcal{F}'$ are finite nonempty sets
of input symbols and output symbols, $Q_f \subseteq Q$ is a set of final states and $\Delta$ is a set of
transduction rules of the following two types:

$$f(q_1(x_1), \dots, q_n(x_n)) \to q(u), \qquad (2.14)$$

where $f \in \mathcal{F}_n$, $u \in T(\mathcal{F}', \mathcal{X}_n)$, $q, q_1, \dots, q_n \in Q$, and $x_1, \dots, x_n \in \mathcal{X}_n$, and

$$q(x_1) \to q'(u) \quad (\varepsilon\text{-rule}), \qquad (2.15)$$

where $u \in T(\mathcal{F}', \mathcal{X}_1)$, $q, q' \in Q$, and $x_1 \in \mathcal{X}_1$.

Let $t, t' \in T(\mathcal{F} \cup \mathcal{F}' \cup Q)$. The move relation $\to_U$ is defined as:

$$t \to_U t' \iff \begin{cases}
\exists f(q_1(x_1), \dots, q_n(x_n)) \to q(u) \in \Delta, \\
\exists C \in \mathcal{C}(\mathcal{F} \cup \mathcal{F}' \cup Q), \\
\exists u_1, \dots, u_n \in T(\mathcal{F}'), \\
t = C[f(q_1(u_1), \dots, q_n(u_n))], \\
t' = C[q(u\{x_1 \leftarrow u_1, \dots, x_n \leftarrow u_n\})].
\end{cases} \qquad (2.16)$$

The reflexive and transitive closure of $\to_U$ is $\to^*_U$. The relation induced by $U$ (also denoted
as $U$) is:

$$U = \{(t, t') \mid t \to^*_U q(t'),\ t \in T(\mathcal{F}),\ t' \in T(\mathcal{F}'),\ q \in Q_f\}. \qquad (2.17)$$

A transducer is $\varepsilon$-free if there is no $\varepsilon$-rule in $\Delta$. If all transduction rules are linear
(no variable occurs twice in the right-hand side), then the transducer is linear. It is
non-erasing if, for each rule, at least one symbol from $\mathcal{F}'$ occurs in the right-hand side. In
a complete (or non-deleting) transducer, for every rule $f(q_1(x_1), \dots, q_n(x_n)) \to q(u)$ and for
every $x_i$ ($1 \leq i \leq n$), $x_i$ occurs at least once in $u$. An $\varepsilon$-free transducer where there are no
two rules with the same left-hand side is called deterministic (DBUTT).

2.7.2 Top-down Tree Transducers 

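A nondeterministic top-down tree transducer (NTDTT) is a 5-tuple $D = (Q, \mathcal{F}, \mathcal{F}', Q_i, \Delta)$,
where $Q$ is a set of states ($Q \cap \mathcal{F} = \emptyset$, $Q \cap \mathcal{F}' = \emptyset$), $\mathcal{F}$ and $\mathcal{F}'$ are finite nonempty sets of
input and output symbols, $Q_i \subseteq Q$ is a set of initial states and $\Delta$ is a set of transduction
rules of the following two types:

$$q(f(x_1, \dots, x_n)) \to u[q_1(x_{i_1}), \dots, q_p(x_{i_p})], \qquad (2.18)$$

where $f \in \mathcal{F}_n$, $u \in \mathcal{C}^p(\mathcal{F}')$, $q, q_1, \dots, q_p \in Q$, $x_{i_1}, \dots, x_{i_p} \in \mathcal{X}_n$, and

$$q(x) \to u[q_1(x), \dots, q_p(x)] \quad (\varepsilon\text{-rule}), \qquad (2.19)$$

where $u \in \mathcal{C}^p(\mathcal{F}')$, $q, q_1, \dots, q_p \in Q$, $x \in \mathcal{X}$.
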
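Let $t, t' \in T(\mathcal{F} \cup \mathcal{F}' \cup Q)$. The move relation $\to_D$ is defined as:

$$t \to_D t' \iff \begin{cases}
\exists q(f(x_1, \dots, x_n)) \to u[q_1(x_{i_1}), \dots, q_p(x_{i_p})] \in \Delta, \\
\exists C \in \mathcal{C}(\mathcal{F} \cup \mathcal{F}' \cup Q), \\
\exists u_1, \dots, u_n \in T(\mathcal{F}), \\
t = C[q(f(u_1, \dots, u_n))], \\
t' = C[u[q_1(v_1), \dots, q_p(v_p)]] \text{ where } v_j = u_k \text{ if } x_{i_j} = x_k.
\end{cases} \qquad (2.20)$$

The reflexive and transitive closure of $\to_D$ is $\to^*_D$. The relation induced by $D$ (also denoted
as $D$) is:

$$D = \{(t, t') \mid q(t) \to^*_D t',\ t \in T(\mathcal{F}),\ t' \in T(\mathcal{F}'),\ q \in Q_i\}. \qquad (2.21)$$

The notions of $\varepsilon$-free, linear, non-erasing, complete and deterministic (DTDTT) top-down tree
transducers are defined in the same way as in the bottom-up case.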
Chapter 3

Existing Tree Automata Libraries 



This chapter describes several implementations of finite tree automata libraries, focusing on
the two most interesting ones from our point of view: Timbuk and MONA.

3.1 Timbuk 

Timbuk [9] is a collection of tools for achieving proofs of reachability over term rewriting
systems and for manipulating tree automata. The system is written in OCaml, a popular
functional programming language. Version 2.2 of Timbuk was surveyed; although the newer
version 3.0 is currently available, that version has abandoned the tree automata library
present in earlier versions, as the tool now focuses on reachability analysis and equational
approximations of term rewriting systems. The library is free software (available under
the GNU LGPLv2 [10] licence), therefore it was possible to study the
implementation.

The tree automaton is implemented as a tuple of lists: a list of symbols (an alphabet), 
a list of state operators, a list of states, a list of final states, a list of transitions and a list of 
prioritary transitions. The supported operations on tree automata are the standard ones: 
intersection, union, language emptiness, deletion of inaccessible states, determinisation and 
others. Since states and transitions are represented as lists, the aforementioned operations 
are implemented in a straightforward way. The library is able to construct a tree automaton 
directly from a given term rewriting system. 

3.2 MONA 

MONA is a tool (released free of charge under the GNU GPLv2 licence [12]) that
implements decision procedures for the weak second-order theory of one or two successors
(WS1S/WS2S). These logics are notable for the following reasons:

WS1S Büchi claims in [13] that WS1S has an expressive power equivalent to regular
expressions, i.e. it can be used to denote the class of regular languages.

WS2S According to [14] (who further refers to Thatcher and Wright [15]), WS2S is
equivalent in expressive power to the class of regular tree languages.

Indeed, MONA uses finite automata and finite tree automata for determining the truth
status of formulae in WS1S and WS2S, respectively. Independently of MONA, Glenn and
Gasarch [16] also implemented an automaton-based decision procedure for WS1S.
MONA was actively developed for six years, but since 2002 no further progress of the
tool has appeared and only bugfixes have been applied. However, Klarlund et al. [17]
claim that the developers of MONA tried many approaches to deal with common problems
and tuned the tool to give the best performance. The most important feature mentioned
is the symbolic representation of transition functions of automata by multi-terminal binary
decision diagrams (MTBDD), which are a generalisation of reduced ordered binary decision
diagrams (ROBDD, often abbreviated just as BDD, see [18] for further details). The idea
of generalising BDDs to MTBDDs is to assign multiple values to the sink nodes of the
diagram (i.e. generalising a function $f$ represented by a BDD, $f : \{0, 1\}^n \to \{0, 1\}$, to a function
$g$ represented by an MTBDD, $g : \{0, 1\}^n \to D$, where $D$ is an arbitrary domain such that it
contains a bottom element $\bot \in D$).

Since BDDs are only a compact representation of formulae in propositional logic with
Boolean variables $x_1, \dots, x_n$, Boolean formulae can be used for their
description. A BDD $f : \{0, 1\}^n \to \{0, 1\}$ maps to the Boolean formula

$$\sum_{(a_1, \dots, a_n) \in \{0,1\}^n} \left( \prod_{a_i = 0} \neg x_i \cdot \prod_{a_i = 1} x_i \cdot f(a_1, \dots, a_n) \right). \qquad (3.1)$$

The mapping for MTBDDs is analogous; however, a few preconditions need to be imposed
on the domain $D$:

(i) the product of $x \in \{0, 1\}$ and $d \in D$ is defined as

$$x \cdot d \stackrel{\mathrm{def}}{=} \begin{cases} \bot & \text{if } x = 0, \\ d & \text{if } x = 1, \end{cases} \qquad (3.2)$$

(ii) the addition operation on $D$ needs to ensure that for $d \in D$ it holds that

$$d + \bot = \bot + d = d. \qquad (3.3)$$

Then we define the mapping from an MTBDD $g : \{0, 1\}^n \to D$ to the formula

$$\sum_{(a_1, \dots, a_n) \in \{0,1\}^n} \left( \prod_{a_i = 0} \neg x_i \cdot \prod_{a_i = 1} x_i \cdot g(a_1, \dots, a_n) \right). \qquad (3.4)$$



Example 3. Figure 3.1 shows the structure of the following decision
diagrams:

a) a BDD representing formula:

$$x_1 \neg x_3 + \neg x_1 x_2 \neg x_3 + \neg x_1 \neg x_2 x_3, \qquad (3.5)$$

b) an MTBDD representing formula:

$$\neg x_1 \neg x_2 \neg x_3 \cdot A + x_1 x_3 \cdot B + \neg x_1 x_2 x_3 \cdot B. \qquad (3.6)$$

Note that, e.g., the expression $x_1 x_3$ represents the expression $x_1 (x_2 + \neg x_2) x_3$, which fully
expands to $x_1 x_2 x_3 + x_1 \neg x_2 x_3$.
Figure 3.1: Examples of the structure of a) a BDD and b) an MTBDD.



When MTBDDs are used for transition table representation, every symbol of the input
alphabet $\Sigma$ is assigned a binary string, i.e. there exists an encoding function
$enc : \Sigma \to \{0, 1\}^n$, where $n = \lceil \lg |\Sigma| \rceil$. The values of the sink nodes (the set $D$ of the previously
mentioned function $g$) are the states of the automaton. Such an MTBDD can be used to
describe the transition relation of the automaton for a single state: the input symbol is
encoded by the function $enc$ into a sequence of binary digits $(x_1, \dots, x_n)$, where $x_1, \dots, x_n$
correspond to the Boolean variables of the MTBDD. The assignment to the variables denotes
the path that is to be taken in the diagram and determines the sink node (i.e. the next state
of the automaton). Such an MTBDD may either exist for every state of the automaton,
or, preferably, a shared MTBDD is used. This is another generalisation, which merges all
diagrams into a single one with multiple root nodes (each corresponding to a different state)
and changes the tree-like structure of an MTBDD into a directed acyclic graph (DAG).
This solution yields a compact representation of the transition function even for large input
alphabets.
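
To make the lookup along a path of the diagram concrete, here is a small Python sketch. It is a simplified stand-in and not MONA's data structure: the diagram is built as a plain decision tree in which redundant tests are removed, but equal subgraphs are not shared. The function names, the four-letter alphabet and the states `q1`, `q2` are assumptions for the example, and `None` plays the role of the bottom element.

```python
def make_encoding(alphabet):
    """Assign each symbol a bit tuple of length n = ceil(lg |alphabet|)."""
    n = max(1, (len(alphabet) - 1).bit_length())
    enc = {s: tuple((i >> (n - 1 - k)) & 1 for k in range(n))
           for i, s in enumerate(alphabet)}
    return n, enc

BOTTOM = None  # stands for the bottom element of the domain D

def build(table, n, prefix=()):
    """Build a diagram from a table mapping bit tuples to sink values.

    Inner nodes are tuples (variable index, low child, high child);
    sinks are the values themselves (strings or BOTTOM).
    """
    if len(prefix) == n:
        return table.get(prefix, BOTTOM)
    low = build(table, n, prefix + (0,))
    high = build(table, n, prefix + (1,))
    if low == high:              # the test on this variable is redundant
        return low
    return (len(prefix), low, high)

def lookup(node, bits):
    """Follow the path selected by the variable assignment down to a sink."""
    while isinstance(node, tuple):
        var, low, high = node
        node = high if bits[var] else low
    return node

# Transitions of one hypothetical state: a -> q1, b -> q1, c -> q2, d undefined.
n, enc = make_encoding(["a", "b", "c", "d"])
transitions = {"a": "q1", "b": "q1", "c": "q2"}
root = build({enc[s]: q for s, q in transitions.items()}, n)
```

Because `a` and `b` share the first bit and the same target, the second variable is not tested on that branch, which is the kind of compaction that makes the symbolic representation attractive for large alphabets.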

Another concept introduced by the MONA developers is the guided tree automaton [19], which
is supposed to tackle state space blow-up. Bottom-up tree automata suffer from the
way they work: while the automaton traverses the tree from its leaves to
the root, it does not have any information about its position in the tree. The guided tree
automaton provides a guide, an additional top-down tree automaton that labels tree nodes
by assigning state spaces to them, making them aware of their position in the tree. This
assignment is done before the actual automaton starts working. When it does, it operates
faster, since every state space has its own state set and transition table. The guide needs
to be either provided by the programmer, or it can be synthesized automatically for certain
domains (e.g. the WSRT logic used for the description of recursive data types, which is
implemented in MONA).

Another noteworthy optimization applied in MONA is so-called eager minimisation:
whenever the structure of an automaton is modified, a Myhill-Nerode minimisation is
performed. Although originally not expected, this strategy yields very good results. Despite
the fact that formulae are often represented in the form of trees (at least during syntactic
analysis of the formula), MONA uses DAGs for their representation. Common subexpressions
are identified and collapsed, thus saving both space and time.
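
Collapsing common subexpressions into a DAG is commonly done by hash-consing. The Python sketch below shows the general idea only; it is not taken from MONA, and the class name `FormulaDag` and its interface are hypothetical.

```python
class FormulaDag:
    """Store formula nodes so that structurally equal subformulae share one node."""

    def __init__(self):
        self._ids = {}    # (operator, child ids...) -> node id
        self.nodes = []   # node id -> (operator, child ids...)

    def make(self, op, *children):
        """Return the id of the node (op, children), creating it only if new."""
        key = (op, *children)
        if key not in self._ids:
            self._ids[key] = len(self.nodes)
            self.nodes.append(key)
        return self._ids[key]

# Build (x and y) or (x and y): the conjunction is stored only once.
dag = FormulaDag()
x = dag.make("x")
y = dag.make("y")
left = dag.make("and", x, y)
right = dag.make("and", x, y)
top = dag.make("or", left, right)
```

Both occurrences of the conjunction receive the same node id, so the DAG holds four nodes instead of the seven nodes of the syntax tree, and any result computed for the shared node can be reused.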
Although MONA only supports work with deterministic finite (tree) automata, there are
formal verification techniques (such as [20]) that can efficiently work directly with
nondeterministic finite (tree) automata, thus avoiding a possible exponential blow-up in time
and space caused by automata determinisation. After the construction of a finite tree automaton,
MONA tries to find both a satisfying example and a counterexample. Therefore there is
no efficient support for sophisticated manipulation of automata, which may be required
by some verification methods.



3.3 Other Libraries 

The Java library Lethal [21] supports numerous operations on tree automata, such as checking 
whether certain properties (determinism, completeness, ...) hold for a given automaton, or 
standard operations on languages (such as union, intersection, complement or difference). 
The implementation appears to be quite naive, with a primary focus on education. However, 
Lethal is the only one of the studied libraries that also implements tree transducers and 
hedge automata (a modification of tree automata for unranked trees). 

Binary Tree Automata Library [22] is a Caml library for tree automata. The implementation 
provides only basic functions and is close to Timbuk (see Section 3.1), although it 
uses hash tables for the representation of the transition table and language-provided sets 
for state sets. 

A simple implementation of a tree automata library in ELAN can be found in [23]. This 
library also provides only basic functionality. 

Chapter 4 

Analysis 

This chapter starts with an introduction to abstract regular tree model checking, continues 
with an analysis of the potential use of the library, and collects the requirements for the 
library. 

4.1 Abstract Regular Tree Model Checking 

The basic idea of regular tree model checking is to decide the emptiness of the language 

$$\tau^{*}(\mathcal{L}(\mathit{Init})) \cap \mathcal{L}(\mathit{Bad}), \tag{4.1}$$

where $\mathit{Init}$ is a tree automaton denoting the set of initial states of the system, 
$\mathit{Bad}$ is a tree automaton expressing the set of states violating the safety 
properties of the system, and $\tau$ is a linear tree transducer representing the transition 
relation of the system. Because an iterative computation of 
$\tau^{*}(\mathcal{L}(\mathit{Init}))$ may not terminate, several acceleration methods 
have been proposed. One of them is abstract regular tree model checking [24], which is 
an acceleration technique based on the abstract-check-refine paradigm. An abstraction 
$\alpha$ is a function from the set $\mathbb{M}_\mathcal{F}$ of all tree automata over a 
ranked alphabet $\mathcal{F}$ to its subset $\mathbb{A}_\mathcal{F}$, 
$\mathbb{A}_\mathcal{F} \subseteq \mathbb{M}_\mathcal{F}$, such that 
$\forall M \in \mathbb{M}_\mathcal{F} : \mathcal{L}(M) \subseteq \mathcal{L}(\alpha(M))$. 

4.1.1 Abstraction Based on Languages of Finite Height 

Abstraction based on languages of finite height, which was introduced in [24], defines two 
states of a tree automaton as equivalent if their languages up to a given height $n$ are 
identical. It can be implemented similarly to the Myhill-Nerode minimisation, 
except that the procedure stops after $n$ iterations. 
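
The equivalence can be illustrated by a minimal Python sketch. It assumes an explicit (non-symbolic) transition table and assumes that a tree consisting of a single leaf has height 1; the function names and the example automaton are hypothetical and are not part of the library. Instead of the bounded Myhill-Nerode-style refinement mentioned above, the sketch naively enumerates the trees of height at most n that reach each state, which is feasible only for very small automata.

```python
from itertools import product

def languages_up_to_height(transitions, states, n):
    """Map each state to the set of trees of height <= n that can reach it.

    transitions: {(symbol, (child_state, ...)): set of target states}
    A tree is a nested tuple (symbol, subtree_1, ..., subtree_k).
    """
    lang = {q: set() for q in states}
    for _ in range(n):
        new = {q: set(lang[q]) for q in states}
        for (symbol, children), targets in transitions.items():
            # Combine all known subtrees of the child states.
            for subtrees in product(*(lang[c] for c in children)):
                for q in targets:
                    new[q].add((symbol,) + subtrees)
        lang = new
    return lang

def finite_height_classes(transitions, states, n):
    """Group states whose languages up to height n are identical."""
    lang = languages_up_to_height(transitions, states, n)
    classes = {}
    for q in sorted(states):
        classes.setdefault(frozenset(lang[q]), []).append(q)
    return sorted(classes.values())

# Hypothetical automaton: a -> q0, f(q0) -> q1, f(q1) -> q2, f(q2) -> q1.
transitions = {
    ("a", ()): {"q0"},
    ("f", ("q0",)): {"q1"},
    ("f", ("q1",)): {"q2"},
    ("f", ("q2",)): {"q1"},
}
states = {"q0", "q1", "q2"}
classes_h1 = finite_height_classes(transitions, states, 1)
classes_h2 = finite_height_classes(transitions, states, 2)
```

With height 1 the states q1 and q2 are collapsed because neither is reachable by a single leaf; with height 2 they are distinguished.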

4.1.2 Abstraction Based on Predicate Languages 

Given a set of predicate tree automata $\mathcal{P} = \{P_1, \dots, P_n\}$, abstraction based 
on predicate languages (also introduced in [24]) defines two states of a tree automaton as 
equivalent if their languages have a nonempty intersection with exactly the same subset of 
languages represented by tree automata from $\mathcal{P}$. This can be done by labelling every 
state with the predicates that have a nonempty intersection with the language of the state and 
collapsing states with identical labelling. 

4.2 Requirements 

An important requirement for the library is to enable working directly with nondeterministic 
tree automata without determinising the automaton first. This avoids the state explosion 
connected with automaton determinisation in some verification techniques 
(see [20]). The following standard operations must be implemented in the library: 

• creating a finite tree automaton denoting the union of languages of given finite tree 
automata, 

• creating a finite tree automaton denoting the intersection of languages of given finite 
tree automata, 

• creating a finite tree automaton denoting the complement of the language of a given 
finite tree automaton, 

• determinisation of a finite tree automaton, 

• minimisation of a finite tree automaton, 

• determining emptiness of language of a finite tree automaton, 

• reducing the size of a given nondeterministic finite tree automaton without determin- 
isation, and 

• determining inclusion of languages of given finite tree automata while avoiding deter- 
minisation of any automaton. 

The library also needs to implement tree transducers, at least in their structure-preserving 
form. Certain techniques [24] [25] [26] [27] [28] need to efficiently traverse all states of 
the automaton in order to, for instance, compute the abstraction of the automaton. Support 
for this is also necessary. 

Chapter 5 

Design 



This chapter starts with a description of the representation that we propose for the transi- 
tion function of nondeterministic finite tree automata. This is followed by a description of 
algorithms for operations on finite tree automata that use this representation. Relabelling 
tree transducers and operations with them are described at the end of the chapter. 

5.1 Representation of a Finite Tree Automaton 

The performance of operations on a nondeterministic finite tree automaton 
$\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ is mostly affected by the choice of the data 
structure for representing the transition function $\Delta$. Two major approaches are possible: 

Explicit representation This approach represents the transition function of a tree au- 
tomaton by enumerating all transitions in a data structure used for a representation 
of the set. 

Symbolic representation This method is a popular approach in model checking that is 
based on a representation of the transition function using Boolean formulae. The 
exact form of the representation varies depending on the application, however the 
most popular data structure used for representing Boolean formulae is the BDD. 

The analysis in Chapter 3 showed that symbolic representation using MTBDDs is a very 
promising approach. Therefore we chose MTBDDs for the representation of the transition 
function. To recap, an MTBDD is a data structure that stores a mapping 
$g : \{0,1\}^n \to D$, where $D$ is an arbitrary set. 

Our design attempts to tackle the problem of large alphabets by using a shared MTBDD 
such that the domain of the MTBDD, i.e. the sequence of Boolean variables $\{0,1\}^n$, 
represents binary encodings of symbols from $\mathcal{F}$ according to some encoding function 
$\mathit{enc} : \mathcal{F} \to \{0,1\}^n$, where $n \le \lceil \lg |\mathcal{F}| \rceil$ 
(note that $n$ may be smaller than $\lceil \lg |\mathcal{F}| \rceil$ because when the arity 
of symbols is implicit, several symbols with different arity may map to one assignment of 
Boolean variables; in conflicting cases, we will denote a symbol $f \in \mathcal{F}_p$ as 
$f_p$). Using the function $\mathit{enc}$ to encode symbols from $\mathcal{F}$, an MTBDD may 
represent a function $g : \mathcal{F} \to D$. Before we proceed, let us first define the set 
of super-states $S(\Delta)$ of the transition function $\Delta$ as 

$$S(\Delta) = \{(q_1, \dots, q_p) \mid p \ge 0,\ f(q_1, \dots, q_p) \to D \in \Delta,\ f \in \mathcal{F}_p,\ D \subseteq Q,\ D \ne \emptyset\} \tag{5.1}$$

or, in the case of a complete automaton with a sink state $q_{sink}$, 

$$S(\Delta) = \{(q_1, \dots, q_p) \mid p \ge 0,\ f(q_1, \dots, q_p) \to D \in \Delta,\ f \in \mathcal{F}_p,\ D \subseteq Q,\ D \ne \{q_{sink}\}\}. \tag{5.2}$$

Let $S_n(\Delta)$ be the set of super-states of $\Delta$ of arity $n$. Note that the empty 
sequence $()$ represents the initial super-state, i.e. the super-state from which transitions 
over leaf nodes are possible. We also extend the definition of the membership relation 
$\in$ to super-states in the following way: 

$$q \in (q_1, \dots, q_n) \iff \exists\, 1 \le i \le n : q = q_i \tag{5.3}$$



Using the previous definition of super-states and the definition of $\Delta$ (see 
Equation 2.2), we may alternatively define the transition function of an automaton 
$\mathcal{A}$ as a mapping $\Delta^{*}$ in the following way: 

$$\Delta^{*} : S \to (\mathcal{F} \to 2^{Q}), \qquad (q_1, \dots, q_p) \mapsto \{(f, D) \mid f(q_1, \dots, q_p) \to D \in \Delta\} \tag{5.4}$$

This means that we may represent the transition function $\Delta$ of a tree automaton 
$\mathcal{A}$ as a data structure that associates each super-state with an MTBDD that is 
indexed using a binary encoding of symbols from $\mathcal{F}$ and has subsets of $Q$ in its 
sink nodes. When a shared MTBDD is used, each super-state is mapped to a root of the MTBDD. 
In case $\Delta^{*}$ is not total, we make it total by assigning an MTBDD in which all symbols 
map to $\{q_{sink}\}$ to each super-state that has no image in $\Delta^{*}$, which yields a 
complete automaton. In the following, we do not distinguish between $\Delta$ and $\Delta^{*}$. 



Example 4. Consider the nondeterministic finite tree automaton 
$\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$, $Q = \{q_1, q_2, q_3\}$, 
$\mathcal{F} = \{a, b_0, b_2, c_0, c_1, d_1\}$, and 
$\Delta = \{b_0 \to \{q_1, q_2\},\ c_0 \to \{q_2\},\ c_1(q_2) \to \{q_3\},\ b_2(q_1, q_3) \to \{q_1, q_2\},\ d_1(q_3) \to \{q_1, q_2\}\}$ 
($Q_f$ is not important at this point). A shared MTBDD corresponding to $\Delta$ is in 
Figure 5.1. 




Figure 5.1: A representation of $\Delta$ by a shared MTBDD. Encoding of symbols from 
$\mathcal{F}$: a: 00, b: 01, c: 10, d: 11. Dashed lines represent value 0 of the given 
variable, solid lines represent value 1. 

5.2 Operations on MTBDDs 

This section describes algorithms that perform operations on finite tree automata with a 
transition function represented by an MTBDD. These algorithms manipulate MTBDDs 
using the following two functions, which are sufficient for this purpose: 

Apply  The standard Apply function that performs a given binary operation op on all 
respective sink nodes (this means sink nodes accessible over the same symbol) of 
two input MTBDDs (lhs and rhs for the left-hand side MTBDD and the right-hand side 
MTBDD respectively) and returns the resulting MTBDD. 

$$\mathit{Apply} : (\mathcal{F} \to 2^{Q}) \to (\mathcal{F} \to 2^{Q}) \to (2^{Q} \to 2^{Q} \to 2^{Q}) \to (\mathcal{F} \to 2^{Q})$$
$$\mathit{Apply}\ \mathit{lhs}\ \mathit{rhs}\ \mathit{op} = \lambda x\,.\,\mathit{op}\ (\mathit{lhs}\ x)\ (\mathit{rhs}\ x) \tag{5.5}$$

MonadicApply  The monadic version of the Apply function that performs a given unary 
operation op on all sink nodes of the input MTBDD tf and returns the resulting MTBDD. 

$$\mathit{MonadicApply} : (\mathcal{F} \to 2^{Q}) \to (2^{Q} \to 2^{Q}) \to (\mathcal{F} \to 2^{Q})$$
$$\mathit{MonadicApply}\ \mathit{tf}\ \mathit{op} = \lambda x\,.\,\mathit{op}\ (\mathit{tf}\ x) \tag{5.6}$$

Note that $\lambda$-calculus is used for definitions and applications of functions that work 
with MTBDDs in order to make them more comprehensible. In case the result of the Apply 
operation is not stored (the operation is performed solely for the side effect of op), op does 
not need to return a value. Further, we assume that transition functions for all automata 
are stored in a single shared MTBDD. 
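
The two functions can be illustrated by a small Python sketch in which a plain dictionary from symbols to sets of states stands in for an MTBDD. This stand-in, the names `mtbdd_apply` and `monadic_apply`, and the convention that a missing symbol maps to the sink leaf are assumptions made for illustration only; a real MTBDD shares structure and is traversed recursively over Boolean variables.

```python
SINK = frozenset({"q_sink"})  # leaf value used when a symbol has no transition

def mtbdd_apply(lhs, rhs, op):
    """Apply: combine the leaves reached over the same symbol with a binary op."""
    symbols = set(lhs) | set(rhs)
    return {x: op(lhs.get(x, SINK), rhs.get(x, SINK)) for x in symbols}

def monadic_apply(tf, op):
    """MonadicApply: transform every explicitly stored leaf with a unary op."""
    return {x: op(leaf) for x, leaf in tf.items()}

lhs = {"a": frozenset({"q1"}), "b": frozenset({"q2"})}
rhs = {"a": frozenset({"q3"})}

united = mtbdd_apply(lhs, rhs, lambda X, Y: X | Y)
sizes = monadic_apply(lhs, len)
```

In this sketch `united` maps the symbol a to the union of both leaves, while b is united with the implicit sink leaf of `rhs`.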

5.2.1 Insertion of a Transition 

Inserting a transition into an MTBDD-represented transition function of a finite tree 
automaton $\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ is done by creating a new MTBDD with 
only the given transition and merging it with the original MTBDD representing $\Delta$ by 
substituting the sink node at the position given by the symbol of the transition with the new 
value, as described in Algorithm 1. In order to create a new MTBDD with a given transition, an 
additional function is necessary: 

CreateMTBDD  This function creates an MTBDD which maps a single symbol to a single 
set of states. 

$$\mathit{CreateMTBDD} : \mathcal{F} \to 2^{Q} \to (\mathcal{F} \to 2^{Q})$$
$$\mathit{CreateMTBDD}\ k\ D = \lambda x\,.\,\text{if } x = k \text{ then } D \text{ else } \{q_{sink}\} \tag{5.7}$$

Algorithm 1: Transition insertion 
Input: Transition function $\Delta_{IN}$ 
       Transition $f(q_1, \dots, q_n) \to D$ to be inserted 
Output: $\Delta_{OUT} = (\Delta_{IN} \setminus \{f(q_1, \dots, q_n) \to E \mid E \subseteq Q\}) \cup \{f(q_1, \dots, q_n) \to D\}$ 
begin 
    tmp := CreateMTBDD $f$ $D$; 
    sp := $(q_1, \dots, q_n)$; 
    $\Delta_{OUT}$ := $\Delta_{IN}$; 
    $\Delta_{OUT}$ sp := Apply ($\Delta_{IN}$ sp) tmp ($\lambda X\, Y$ . if $Y = \{q_{sink}\}$ then $X$ else $Y$); 
    return $\Delta_{OUT}$; 
end 
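
Transition insertion can be sketched in Python as follows. The sketch again uses a dictionary from symbols to sets of states as a stand-in for an MTBDD, and a dictionary from super-states (tuples of states) to such stand-ins for the transition function; these representations and all function names are assumptions made for illustration, not the interface of the library.

```python
SINK = frozenset({"q_sink"})  # leaf value meaning "no transition"

def create_mtbdd(k, D):
    """CreateMTBDD: symbol k maps to D, all other symbols implicitly map to SINK."""
    return {k: frozenset(D)}

def mtbdd_apply(lhs, rhs, op):
    """Apply a binary op to the leaves reached over the same symbol."""
    symbols = set(lhs) | set(rhs)
    return {x: op(lhs.get(x, SINK), rhs.get(x, SINK)) for x in symbols}

def insert_transition(delta, symbol, superstate, D):
    """Return a copy of delta in which symbol(superstate) -> D replaces any old target."""
    tmp = create_mtbdd(symbol, D)
    out = dict(delta)  # the input transition function is left untouched
    out[superstate] = mtbdd_apply(delta.get(superstate, {}), tmp,
                                  lambda X, Y: X if Y == SINK else Y)
    return out

d1 = insert_transition({}, "b", (), {"q1", "q2"})
d2 = insert_transition(d1, "c", (), {"q2"})
d3 = insert_transition(d2, "b", (), {"q3"})  # overrides the target set of b
```

As in Algorithm 1, the operation passed to Apply keeps the old leaf wherever the new single-transition MTBDD holds the sink value and takes the new leaf otherwise.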






5.2.2 Retrieval of a Transition 

The algorithm that retrieves a transition (i.e. for a given super-state $(q_1, \dots, q_n)$ 
and a symbol $f$ returns $D$ such that $f(q_1, \dots, q_n) \to D \in \Delta$) from an 
MTBDD-represented transition function $\Delta$ first creates a projection BDD and makes a 
projection of the MTBDD representing the transition function for the given super-state 
according to the given symbol of the input alphabet. A projection BDD 
$\mathcal{F} \to \{0, 1\}$ is a BDD over the same set of Boolean variables as the transition 
function MTBDD which identifies the nodes that are to be excluded from the MTBDD with value 0 
and the others with value 1. After the projection is done, MonadicApply collects the sink 
nodes of the resulting MTBDD. The algorithm, which is described in Algorithm 2, needs the 
following two additional functions for working with projection BDDs: 

CreateProjection  This function creates a projection BDD for symbol $k$. 

$$\mathit{CreateProjection} : \mathcal{F} \to (\mathcal{F} \to \{0, 1\})$$
$$\mathit{CreateProjection}\ k = \lambda x\,.\,\text{if } x = k \text{ then } 1 \text{ else } 0 \tag{5.8}$$

Project  Makes a projection of the MTBDD lhs using a projection BDD rhs and returns the 
resulting MTBDD. 

$$\mathit{Project} : (\mathcal{F} \to 2^{Q}) \to (\mathcal{F} \to \{0, 1\}) \to (\mathcal{F} \to 2^{Q})$$
$$\mathit{Project}\ \mathit{lhs}\ \mathit{rhs} = \lambda x\,.\,\text{if } (\mathit{rhs}\ x) = 1 \text{ then } (\mathit{lhs}\ x) \text{ else } \{q_{sink}\} \tag{5.9}$$



Algorithm 2: Transition retrieval 
Input: Transition function $\Delta$ 
       Symbol $f$ and super-state $(q_1, \dots, q_n)$ 
Output: $D = \{E \mid f(q_1, \dots, q_n) \to E \in \Delta\}$ 
begin 
    states := $\emptyset$; 
    tmp := CreateProjection $f$; 
    proj := Project ($\Delta$ $(q_1, \dots, q_n)$) tmp; 
    MonadicApply proj (collect states); 
    return states; 
end 

Function collect(states, leaf) 
begin 
    states := states $\cup$ leaf; 
end 

5.2.3 Language Union 



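The task of the operation of language union is, for two input tree automata 
$\mathcal{A}_1 = (Q_1, \mathcal{F}, Q_{f1}, \Delta_1)$ and 
$\mathcal{A}_2 = (Q_2, \mathcal{F}, Q_{f2}, \Delta_2)$, to create a tree automaton 
$\mathcal{A}_\cup = (Q_\cup, \mathcal{F}, Q_{f\cup}, \Delta_\cup)$ such that 
$\mathcal{L}(\mathcal{A}_\cup) = \mathcal{L}(\mathcal{A}_1) \cup \mathcal{L}(\mathcal{A}_2)$. 
Although the algorithm presented in Section 2.3.1 preserves determinism, we chose to use a 
simpler approach that does not create a product automaton but rather reuses the transition 
functions of the input automata as much as possible (and may introduce nondeterminism even 
when the input automata are deterministic). 

The idea of this construction is to create an automaton that makes nondeterministic 
transitions over leaf symbols to either $\mathcal{A}_1$ or $\mathcal{A}_2$ and then continues 
its run in the target automaton. Assume without loss of generality that 
$Q_1 \cap Q_2 = \emptyset$; then $Q_\cup = Q_1 \cup Q_2$, $Q_{f\cup} = Q_{f1} \cup Q_{f2}$, and 

$$\begin{aligned}
\Delta_\cup ={} & (\Delta_1 \setminus \{f \to D_1 \mid f \in \mathcal{F}_0,\ D_1 \subseteq Q_1\}) \cup{} \\
 & (\Delta_2 \setminus \{f \to D_2 \mid f \in \mathcal{F}_0,\ D_2 \subseteq Q_2\}) \cup{} \\
 & \{f \to D \mid f \in \mathcal{F}_0,\ D = D_1 \cup D_2,\ f \to D_1 \in \Delta_1,\ f \to D_2 \in \Delta_2\}.
\end{aligned} \tag{5.10}$$

Computations of $Q_\cup$ and $Q_{f\cup}$ are trivial. Since all transitions are stored in a 
single MTBDD, the computation of $\Delta_\cup$ needs only one Apply operation, which merges 
the MTBDDs of the initial super-states: 

$$\Delta_\cup\ () := \mathit{Apply}\ (\Delta_1\ ())\ (\Delta_2\ ())\ (\lambda X\, Y\,.\,X \cup Y) \tag{5.11}$$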
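
A minimal Python sketch of this union construction follows. It assumes an explicit stand-in for the shared MTBDD (a dictionary from super-states to dictionaries from symbols to sets of states) and treats a missing entry as "no transition" instead of using an explicit sink state; the helper `run`, which computes the states reachable over a tree, and the two example automata are hypothetical and serve only to exercise the construction.

```python
from itertools import product

def union(a1, a2):
    """Each automaton is (states, final_states, delta) with disjoint state sets."""
    q1, f1, d1 = a1
    q2, f2, d2 = a2
    assert not (q1 & q2), "state sets must be disjoint"
    delta = {**d1, **d2}  # only the initial super-state () can occur in both
    leaf1, leaf2 = d1.get((), {}), d2.get((), {})
    # Counterpart of Apply (Delta_1 ()) (Delta_2 ()) (lambda X Y . X u Y).
    delta[()] = {x: leaf1.get(x, frozenset()) | leaf2.get(x, frozenset())
                 for x in set(leaf1) | set(leaf2)}
    return q1 | q2, f1 | f2, delta

def run(delta, tree):
    """States reachable over a tree given as (symbol, subtree_1, ..., subtree_k)."""
    symbol, *children = tree
    child_states = [run(delta, c) for c in children]
    result = set()
    for sp in product(*child_states):
        result |= delta.get(sp, {}).get(symbol, frozenset())
    return result

# A1 accepts a, f(a), f(f(a)), ...; A2 accepts only b.
A1 = ({"p"}, {"p"}, {(): {"a": frozenset({"p"})}, ("p",): {"f": frozenset({"p"})}})
A2 = ({"r"}, {"r"}, {(): {"b": frozenset({"r"})}})
states, finals, delta = union(A1, A2)
```

Only the leaf transitions are merged; the MTBDD stand-ins of all other super-states are reused unchanged, which mirrors the motivation for avoiding a product construction.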



Figure 5.2 shows the process of construction of the transition function for the union 
automaton. The procedure is described in Algorithm 3. 

Algorithm 3: Union automaton construction 
Input: Input automata $\mathcal{A}_1 = (Q_1, \mathcal{F}, Q_{f1}, \Delta_1)$ and $\mathcal{A}_2 = (Q_2, \mathcal{F}, Q_{f2}, \Delta_2)$ 
Output: $\mathcal{A}_\cup = (Q_\cup, \mathcal{F}, Q_{f\cup}, \Delta_\cup)$ such that $\mathcal{L}(\mathcal{A}_\cup) = \mathcal{L}(\mathcal{A}_1) \cup \mathcal{L}(\mathcal{A}_2)$ 
begin 
    $Q_\cup$ := $Q_1 \cup Q_2$; 
    $Q_{f\cup}$ := $Q_{f1} \cup Q_{f2}$; 
    $\Delta_\cup$ := $\Delta_1 \cup \Delta_2$; 
    $\Delta_\cup$ () := Apply ($\Delta_1$ ()) ($\Delta_2$ ()) ($\lambda X\, Y$ . $X \cup Y$); 
    return $\mathcal{A}_\cup = (Q_\cup, \mathcal{F}, Q_{f\cup}, \Delta_\cup)$; 
end 

Figure 5.2: Construction of automaton $\mathcal{A}_\cup$ such that $\mathcal{L}(\mathcal{A}_\cup) = \mathcal{L}(\mathcal{A}_1) \cup \mathcal{L}(\mathcal{A}_2)$. 

5.2.4 Language Intersection 

The requirements on the language intersection operation are very similar to language union: 
given two finite tree automata $\mathcal{A}_1 = (Q_1, \mathcal{F}, Q_{f1}, \Delta_1)$ and 
$\mathcal{A}_2 = (Q_2, \mathcal{F}, Q_{f2}, \Delta_2)$, construct a finite tree automaton 
$\mathcal{A}_\cap = (Q_\cap, \mathcal{F}, Q_{f\cap}, \Delta_\cap)$ such that 
$\mathcal{L}(\mathcal{A}_\cap) = \mathcal{L}(\mathcal{A}_1) \cap \mathcal{L}(\mathcal{A}_2)$. 

The construction is done by creating a product automaton (a tree automaton with a state 
set that is the Cartesian product of the state sets of the input automata) which simulates 
a parallel run of both input automata: 

$$\mathcal{A}_\cap = (Q_1 \times Q_2,\ \mathcal{F},\ Q_{f1} \times Q_{f2},\ \Delta_\cap) \tag{5.12}$$

where 

$$\Delta_\cap = \{f((q_{11}, q_{21}), \dots, (q_{1n}, q_{2n})) \to (q_1, q_2) \mid f \in \mathcal{F},\ f(q_{11}, \dots, q_{1n}) \to q_1 \in \Delta_1,\ f(q_{21}, \dots, q_{2n}) \to q_2 \in \Delta_2\} \tag{5.13}$$

such that $\mathcal{A}_\cap$ contains only reachable states and transitions. Detection of 
reachable states is done by starting from the initial super-states of the automata, analysing 
all transitions from reachable super-states and collecting states that may be reached in this 
way until the algorithm has no unanalysed state. A super-state $(q_1, \dots, q_n)$ is 
reachable if $\forall\, 1 \le i \le n : q_i$ is reachable. Due to the fact that we work with 
complete automata (with a sink state $q_{sink} \notin Q_f$), whenever we reach a product state 
$(q_1, q_{sink})$ or $(q_{sink}, q_2)$, we may stop generating further states (this is because 
$q_{sink}$ has only transitions to $q_{sink}$, so no accepting state can be reached from such 
a state). Figure 5.3 shows the first step of the construction of the product automaton. The 
construction process is described in Algorithm 4. 







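Figure 5.3: Construction of automaton $\mathcal{A}_\cap$ such that $\mathcal{L}(\mathcal{A}_\cap) = \mathcal{L}(\mathcal{A}_1) \cap \mathcal{L}(\mathcal{A}_2)$. 

Algorithm 4: Intersection automaton construction 
Input: Input automata $\mathcal{A}_1 = (Q_1, \mathcal{F}, Q_{f1}, \Delta_1)$ and $\mathcal{A}_2 = (Q_2, \mathcal{F}, Q_{f2}, \Delta_2)$ 
Output: $\mathcal{A}_\cap = (Q_\cap, \mathcal{F}, Q_{f\cap}, \Delta_\cap)$ such that $\mathcal{L}(\mathcal{A}_\cap) = \mathcal{L}(\mathcal{A}_1) \cap \mathcal{L}(\mathcal{A}_2)$ 
begin 
    $Q_\cap$ := $Q_{f\cap}$ := $\Delta_\cap$ := $\emptyset$; 
    newStates := empty queue; 
    $\Delta_\cap$ () := Apply ($\Delta_1$ ()) ($\Delta_2$ ()) (intersect newStates); 
    while newStates is not empty do 
        $(q_a, q_b)$ := newStates.dequeue(); 
        if $(q_a, q_b) \notin Q_\cap$ then 
            $Q_\cap$ := $Q_\cap \cup \{(q_a, q_b)\}$; 
            if $q_a = q_{sink} \lor q_b = q_{sink}$ then continue; 
            if $q_a \in Q_{f1} \land q_b \in Q_{f2}$ then $Q_{f\cap}$ := $Q_{f\cap} \cup \{(q_a, q_b)\}$; 
            foreach $n \in \mathbb{N}$ such that $S_n(\Delta_1) \ne \emptyset \land S_n(\Delta_2) \ne \emptyset$ do 
                foreach $(q_{11}, \dots, q_{1n}) \in S_n(\Delta_1)$ such that $q_a \in (q_{11}, \dots, q_{1n})$ do 
                    foreach $(q_{21}, \dots, q_{2n}) \in S_n(\Delta_2)$ such that $q_b \in (q_{21}, \dots, q_{2n})$ do 
                        if $\forall\, 1 \le i \le n : (q_{1i}, q_{2i}) \in Q_\cap$ then 
                            sp1 := $(q_{11}, \dots, q_{1n})$; 
                            sp2 := $(q_{21}, \dots, q_{2n})$; 
                            $\Delta_\cap$ $((q_{11}, q_{21}), \dots, (q_{1n}, q_{2n}))$ := 
                                Apply ($\Delta_1$ sp1) ($\Delta_2$ sp2) (intersect newStates); 
                        end if 
                    end foreach 
                end foreach 
            end foreach 
        end if 
    end while 
    return $\mathcal{A}_\cap = (Q_\cap, \mathcal{F}, Q_{f\cap}, \Delta_\cap)$; 
end 

Function intersect(newStates, lhs, rhs) 
begin 
    productSet := lhs $\times$ rhs; 
    foreach $(q_a, q_b) \in$ productSet do 
        newStates.enqueue($(q_a, q_b)$); 
    end foreach 
    return productSet; 
end 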
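
The reachability-driven product construction can be sketched in Python as follows. The sketch assumes the explicit dictionary stand-in for the shared MTBDD (super-state to symbol to set of states) and omits the sink state, so missing entries mean "no transition"; the function names and the example automata are hypothetical. A product super-state is processed at the moment its last component becomes reachable, which is a slightly stricter trigger than the membership test used in Algorithm 4 but visits the same super-states.

```python
from itertools import product

def intersect(a1, a2):
    """Each automaton is (states, final_states, delta); returns the reachable product."""
    (_, f1, d1), (_, f2, d2) = a1, a2
    states, finals, delta = set(), set(), {}
    queue = []

    def combine(sp1, sp2):
        """Pairwise product of the leaves reached over the same symbol."""
        out = {}
        for x in set(d1[sp1]) & set(d2[sp2]):
            pairs = frozenset(product(d1[sp1][x], d2[sp2][x]))
            if pairs:
                out[x] = pairs
                queue.extend(pairs)  # newly discovered product states
        return out

    if () in d1 and () in d2:
        delta[()] = combine((), ())
    while queue:
        qa, qb = queue.pop()
        if (qa, qb) in states:
            continue
        states.add((qa, qb))
        if qa in f1 and qb in f2:
            finals.add((qa, qb))
        for sp1, sp2 in product(d1, d2):
            if len(sp1) != len(sp2):
                continue
            sp = tuple(zip(sp1, sp2))
            if (qa, qb) in sp and all(p in states for p in sp):
                leaves = combine(sp1, sp2)
                if leaves:
                    delta[sp] = leaves
    return states, finals, delta

# A1 accepts f^n(a) for even n, A2 accepts f^n(a) for n divisible by 3.
A1 = ({"e", "o"}, {"e"}, {(): {"a": {"e"}}, ("e",): {"f": {"o"}}, ("o",): {"f": {"e"}}})
A2 = ({"z", "y", "x"}, {"z"},
      {(): {"a": {"z"}}, ("z",): {"f": {"y"}}, ("y",): {"f": {"x"}}, ("x",): {"f": {"z"}}})
states, finals, delta = intersect(A1, A2)
```

In this example all six product states are reachable, and the only accepting one is the pair of both accepting states.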



5.2.5 Determinisation 

The determinisation operation takes an input finite tree automaton 
$\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ and transforms it into a deterministic finite 
tree automaton $\mathcal{A}_d = (Q_d, \mathcal{F}, Q_{fd}, \Delta_d)$ such that 
$\mathcal{L}(\mathcal{A}_d) = \mathcal{L}(\mathcal{A})$. 

The determinisation algorithm, which is described in Algorithm 5, works with macrostates. 
A macrostate $M \subseteq Q$ is a state of the deterministic automaton that represents all 
states which might have been accessed during the run of the nondeterministic automaton over 
the same sequence of symbols. The algorithm starts from the initial super-state, creates a 
new macrostate for each sink node of the MTBDD for the initial super-state and proceeds 
with finding all super-states $(q_1, \dots, q_n)$ such that there exist macrostates 
$M_1, \dots, M_n$ : $\forall\, 1 \le i \le n : q_i \in M_i$. For each such tuple of 
macrostates, an MTBDD with the union of the sets at the sink nodes of all MTBDDs that can be 
accessed by combinations of states in the given macrostates is created; new macrostates are 
retrieved as the sets of states in the sink nodes of this MTBDD. This guarantees that only 
reachable states are present in the resulting automaton. 

Algorithm 5: Automaton determinisation 
Input: Input automaton $\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ 
Output: Deterministic automaton $\mathcal{A}_d = (Q_d, \mathcal{F}, Q_{fd}, \Delta_d)$, $\mathcal{L}(\mathcal{A}_d) = \mathcal{L}(\mathcal{A})$ 
begin 
    $Q_d$ := $Q_{fd}$ := $\Delta_d$ := $\emptyset$; 
    newStates := empty queue; 
    $\Delta_d$ () := MonadicApply ($\Delta$ ()) (collectSets newStates); 
    while newStates is not empty do 
        $s$ := newStates.dequeue(); 
        if $s \notin Q_d$ then 
            $Q_d$ := $Q_d \cup \{s\}$; 
            if $\exists q_f \in s$ such that $q_f \in Q_f$ then $Q_{fd}$ := $Q_{fd} \cup \{s\}$; 
            foreach $n \in \mathbb{N}$ such that $S_n(\Delta) \ne \emptyset$ do 
                foreach $(q_1, \dots, q_n) \in S_n(\Delta)$ such that $\exists\, 1 \le i \le n : q_i \in s$ do 
                    foreach $s_1, \dots, s_n \in Q_d$ such that $q_1 \in s_1, \dots, q_n \in s_n$, $s_i = s$ do 
                        /* Create empty MTBDD */ 
                        tmp := $\emptyset$; 
                        foreach $(p_1, \dots, p_n) \in S_n(\Delta)$ such that $p_1 \in s_1, \dots, p_n \in s_n$ do 
                            tmp := Apply tmp ($\Delta$ $(p_1, \dots, p_n)$) ($\lambda X\, Y$ . $X \cup Y$); 
                        end foreach 
                        $\Delta_d$ $(s_1, \dots, s_n)$ := MonadicApply tmp (collectSets newStates); 
                    end foreach 
                end foreach 
            end foreach 
        end if 
    end while 
    return $\mathcal{A}_d = (Q_d, \mathcal{F}, Q_{fd}, \Delta_d)$; 
end 

Function collectSets(newStates, tf) 
begin 
    newStates.enqueue(tf); 
    return {tf}; 
end 

5.2.6 Language Complementation 

Given a finite tree automaton $\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$, the task of 
language complementation is to construct an automaton $\mathcal{A}_c$ such that 
$\mathcal{L}(\mathcal{A}_c) = \overline{\mathcal{L}(\mathcal{A})}$. This is done by first 
transforming $\mathcal{A}$ to a deterministic automaton 
$\mathcal{A}_d = (Q_d, \mathcal{F}, Q_{fd}, \Delta_d)$ by the procedure described in 
Section 5.2.5 and then complementing the set of accepting states: 
$\mathcal{A}_c = (Q_d, \mathcal{F}, Q_d \setminus Q_{fd}, \Delta_d)$. 
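
Determinisation followed by complementation can be sketched in Python as follows. The sketch is an explicit bottom-up subset construction over a dictionary stand-in for the transition function, and it iterates to a fixpoint instead of using the work queue of Algorithm 5; the empty macrostate plays the role of the sink state, so the result is complete. The function names, the `alphabet` argument (a mapping from symbols to arities) and the example automaton are assumptions made for illustration.

```python
from itertools import product

def determinise(alphabet, finals, delta):
    """alphabet: {symbol: arity}; delta: super-state -> {symbol: set of states}."""
    macro, d_delta = set(), {}
    changed = True
    while changed:
        changed = False
        for symbol, arity in alphabet.items():
            for sp in product(list(macro), repeat=arity):
                if symbol in d_delta.get(sp, {}):
                    continue  # this transition has already been built
                # Union of the leaves of all super-states drawn from the macrostates.
                target = frozenset(q for qs in product(*sp)
                                   for q in delta.get(qs, {}).get(symbol, ()))
                d_delta.setdefault(sp, {})[symbol] = target
                macro.add(target)
                changed = True
    d_finals = {m for m in macro if m & finals}
    return macro, d_finals, d_delta

def complement(alphabet, finals, delta):
    macro, d_finals, d_delta = determinise(alphabet, finals, delta)
    return macro, macro - d_finals, d_delta

def run_det(d_delta, tree):
    """The unique macrostate reached over a tree (symbol, subtree_1, ...)."""
    symbol, *children = tree
    return d_delta[tuple(run_det(d_delta, c) for c in children)][symbol]

# Nondeterministic automaton accepting exactly the tree f(a).
alphabet = {"a": 0, "f": 1}
finals = {"q2"}
delta = {(): {"a": {"q0", "q1"}}, ("q0",): {"f": {"q0"}}, ("q1",): {"f": {"q2"}}}
macro, d_finals, d_delta = determinise(alphabet, finals, delta)
_, c_finals, _ = complement(alphabet, finals, delta)
```

Because every macrostate is created only as the target of a transition from already known macrostates, only reachable macrostates appear in the result.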



5.2.7 Automaton Reduction 

Reduction of a finite tree automaton is a generic operation that takes a finite tree automaton 
$\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ and a quotient set $Q/{\sim}$ of some 
equivalence relation $\sim$ and returns a reduced finite tree automaton 
$\mathcal{A}_r = (Q_r, \mathcal{F}, Q_{fr}, \Delta_r)$ such that $Q_r = Q/{\sim}$, 
$Q_{fr} = \{D \in Q_r \mid \exists q \in D : q \in Q_f\}$, and 

$$\Delta_r = \{f(B_1, \dots, B_n) \to B \mid f(q_1, \dots, q_n) \to q \in \Delta,\ f \in \mathcal{F},\ q_1 \in B_1, \dots, q_n \in B_n,\ q \in B\}. \tag{5.14}$$

Various methods can be used for obtaining the equivalence relation $\sim$, e.g. Myhill-Nerode 
minimisation (see Section 5.2.9) or downward simulation (see Section 5.2.11). Note 
that while the former approach can be used over deterministic finite tree automata only, 
the latter may be used to reduce the size of nondeterministic finite tree automata as well 
(in their case, however, the result is not a minimal nondeterministic finite tree automaton 
but only a reduced one). The algorithm for reduction of a finite tree automaton is given in 
Algorithm 6. 



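Algorithm 6: Automaton reduction 
Input: Input automaton $\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ 
       Quotient set $Q/{\sim}$ 
Output: Reduced automaton $\mathcal{A}_r = (Q_r, \mathcal{F}, Q_{fr}, \Delta_r)$ 
begin 
    $Q_r$ := $Q/{\sim}$; 
    $Q_{fr}$ := $\{[q]_\sim \mid q \in Q_f\}$; 
    $\Delta_r$ := $\emptyset$; 
    foreach $n \in \mathbb{N}$ such that $S_n(\Delta) \ne \emptyset$ do 
        foreach $(q_1, \dots, q_n) \in S_n(\Delta)$ do 
            sp := $([q_1]_\sim, \dots, [q_n]_\sim)$; 
            $\Delta_r$ sp := Apply ($\Delta_r$ sp) ($\Delta$ $(q_1, \dots, q_n)$) ($\lambda X\, Y$ . $X \cup \{[y]_\sim \mid y \in Y\}$); 
        end foreach 
    end foreach 
    return $\mathcal{A}_r = (Q_r, \mathcal{F}, Q_{fr}, \Delta_r)$; 
end 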
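
The reduction can be sketched in Python as follows, again with a dictionary stand-in for the transition function. The argument `block_of`, which maps every state to its equivalence class, and the example automaton are assumptions made for illustration; the sketch does not check that the supplied partition actually preserves the language.

```python
def reduce_automaton(finals, delta, block_of):
    """block_of: state -> its equivalence class (a frozenset of states)."""
    r_delta = {}
    for sp, leaves in delta.items():
        r_sp = tuple(block_of[q] for q in sp)
        merged = r_delta.setdefault(r_sp, {})
        for symbol, targets in leaves.items():
            # Counterpart of lambda X Y . X u {[y] | y in Y}.
            merged[symbol] = merged.get(symbol, frozenset()) | {block_of[y] for y in targets}
    r_states = set(block_of.values())
    r_finals = {block_of[q] for q in finals}
    return r_states, r_finals, r_delta

# Hypothetical automaton in which q1 and q2 are assumed to be equivalent.
delta = {
    (): {"a": frozenset({"q0"})},
    ("q0",): {"f": frozenset({"q1"}), "g": frozenset({"q2"})},
    ("q1",): {"h": frozenset({"q0"})},
    ("q2",): {"h": frozenset({"q0"})},
}
B0, B1 = frozenset({"q0"}), frozenset({"q1", "q2"})
block_of = {"q0": B0, "q1": B1, "q2": B1}
r_states, r_finals, r_delta = reduce_automaton({"q1"}, delta, block_of)
```

The two super-states (q1) and (q2) collapse into the single super-state (B1), whose leaves are merged.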



5.2.8 Pruning Unreachable States 

The task of pruning unreachable states of a finite tree automaton 
$\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ is the removal of states $q$ (and the 
corresponding transitions, which means removing MTBDDs for all super-states that contain $q$) 
for which there does not exist a tree $t \in T(\mathcal{F})$ such that $t \to^{*} q$. The 
algorithm simulates the run of the automaton for all possible trees and collects the states 
that can be reached. The algorithm is described in Algorithm 7. 

Algorithm 7: Unreachable states pruning 
Input: Input automaton $\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ 
Output: Automaton $\mathcal{A}_p = (Q_p, \mathcal{F}, Q_{fp}, \Delta_p)$ without unreachable states, such that $\mathcal{L}(\mathcal{A}_p) = \mathcal{L}(\mathcal{A})$ 
begin 
    $Q_p$ := $Q_{fp}$ := $\Delta_p$ := $\emptyset$; 
    reachStates := empty queue; 
    $\Delta_p$ () := MonadicApply ($\Delta$ ()) (collectReachable reachStates); 
    while reachStates is not empty do 
        $q$ := reachStates.dequeue(); 
        if $q \notin Q_p$ then 
            $Q_p$ := $Q_p \cup \{q\}$; 
            if $q \in Q_f$ then $Q_{fp}$ := $Q_{fp} \cup \{q\}$; 
            foreach $n \in \mathbb{N}$ such that $S_n(\Delta) \ne \emptyset$ do 
                foreach $(q_1, \dots, q_n) \in S_n(\Delta)$ such that $q \in (q_1, \dots, q_n)$ do 
                    if $\forall\, 1 \le i \le n : q_i \in Q_p$ then 
                        sp := $(q_1, \dots, q_n)$; 
                        $\Delta_p$ sp := MonadicApply ($\Delta$ sp) (collectReachable reachStates); 
                    end if 
                end foreach 
            end foreach 
        end if 
    end while 
    return $\mathcal{A}_p = (Q_p, \mathcal{F}, Q_{fp}, \Delta_p)$; 
end 

Function collectReachable(reachStates, leaf) 
begin 
    foreach $q \in$ leaf do 
        reachStates.enqueue($q$); 
    end foreach 
    return leaf; 
end 
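
A minimal Python sketch of pruning follows. It assumes the dictionary stand-in for the transition function and replaces the work queue of Algorithm 7 by a naive fixpoint iteration, which is simpler to read but repeatedly rescans all super-states; the function names and the example are hypothetical.

```python
def reachable_states(delta):
    """Least set of states closed under the transitions of delta."""
    reach, changed = set(), True
    while changed:
        changed = False
        for sp, leaves in delta.items():
            if all(q in reach for q in sp):  # the super-state itself is reachable
                for targets in leaves.values():
                    if not targets <= reach:
                        reach |= targets
                        changed = True
    return reach

def prune(states, finals, delta):
    """Drop unreachable states and every super-state that contains one."""
    reach = reachable_states(delta)
    p_delta = {sp: leaves for sp, leaves in delta.items()
               if all(q in reach for q in sp)}
    return states & reach, finals & reach, p_delta

# q2 and q3 cannot be reached by any tree.
delta = {
    (): {"a": frozenset({"q0"})},
    ("q0",): {"f": frozenset({"q1"})},
    ("q2",): {"f": frozenset({"q3"})},
    ("q0", "q2"): {"g": frozenset({"q1"})},
}
p_states, p_finals, p_delta = prune({"q0", "q1", "q2", "q3"}, {"q1", "q3"}, delta)
```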



5.2.9 Minimisation 

Automaton minimisation is an operation on a finite tree automaton 
$\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ which returns a deterministic finite tree 
automaton $\mathcal{A}_m = (Q_m, \mathcal{F}, Q_{fm}, \Delta_m)$ such that 
$\mathcal{L}(\mathcal{A}) = \mathcal{L}(\mathcal{A}_m)$ and $\mathcal{A}_m$ has the fewest 
states of all deterministic finite tree automata that accept $\mathcal{L}(\mathcal{A})$. 
The existence of a minimum deterministic finite tree automaton is guaranteed by the proof of 
the Myhill-Nerode Theorem (see Section 2.4). 

The minimisation process starts with pruning unreachable states and determinising the 
input automaton. Once done, Algorithm 8 computes the equivalence relation $\sim$ on $Q$ for 
the congruence used in the Myhill-Nerode Theorem. This is done by refining the equivalence 
relation, starting with two initial classes: the accepting states and the non-accepting 
states. All super-states $(q_1, \dots, q_n)$ are then searched, and in case there exists an 
equivalence class $[q_i]_\sim$ such that when $q_i$ is substituted in $(q_1, \dots, q_n)$ by 
some other element of $[q_i]_\sim$ the target class of the transitions over the respective 
symbols differs, $\sim$ is refined. 

In the following step, the quotient set of $\sim$ is passed to the reduction procedure (see 
Section 5.2.7), and obtaining the minimum automaton is straightforward. 



Algorithm 8: Computation of $\sim$ equivalence over states 
Input: Deterministic automaton without unreachable states $\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ 
Output: Equivalence relation ${\sim} \subseteq Q \times Q$ 
begin 
    eq := $\{(p, q) \mid p \in Q_f \iff q \in Q_f\}$; 
    prevEq := $\emptyset$; 
    while eq $\ne$ prevEq do 
        prevEq := eq; 
        foreach $n \in \mathbb{N}$ such that $S_n(\Delta) \ne \emptyset$ do 
            foreach $(q_1, \dots, q_n) \in S_n(\Delta)$ do 
                foreach $1 \le i \le n$ do 
                    foreach $q \in [q_i]_{prevEq}$ do 
                        $sp_{q_i}$ := $(q_1, \dots, q_{i-1}, q_i, q_{i+1}, \dots, q_n)$; 
                        $sp_q$ := $(q_1, \dots, q_{i-1}, q, q_{i+1}, \dots, q_n)$; 
                        if $sp_q \in S_n(\Delta)$ then 
                            refined := false; 
                            Apply ($\Delta$ $sp_{q_i}$) ($\Delta$ $sp_q$) (refineEq prevEq refined); 
                            if refined then eq := eq $\setminus \{(q, q_i), (q_i, q)\}$; 
                        else 
                            eq := eq $\setminus \{(q, q_i), (q_i, q)\}$; 
                        end if 
                    end foreach 
                end foreach 
            end foreach 
        end foreach 
    end while 
    return ${\sim}$ = eq; 
end 

Function refineEq(prevEq, refined, {lhs}, {rhs}) 
begin 
    if $[\mathit{lhs}]_{prevEq} \ne [\mathit{rhs}]_{prevEq}$ then 
        refined := true; 
    end if 
end 



5.2.10 Checking Language Emptiness 

The problem of determining the emptiness of a language is defined as follows: given a finite 
tree automaton $\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$, is 
$\mathcal{L}(\mathcal{A}) = \emptyset$? The algorithm for deciding language emptiness first 
removes unreachable states from the automaton $\mathcal{A}$ using the method described in 
Section 5.2.8. This constructs a finite tree automaton 
$\mathcal{A}_p = (Q_p, \mathcal{F}, Q_{fp}, \Delta_p)$ without unreachable states. 
It holds that the language $\mathcal{L}(\mathcal{A}_p)$ is empty if and only if 
$Q_{fp} = \emptyset$ (i.e. there is no reachable final state in $\mathcal{A}_p$). Note that a 
slightly more efficient algorithm can determine that $\mathcal{L}(\mathcal{A}) \ne \emptyset$ 
immediately when the analysis of reachable states reaches a state $q$ such that $q \in Q_f$. 
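
The early-exit variant mentioned above can be sketched in Python as follows, assuming the dictionary stand-in for the transition function and a naive fixpoint iteration over all super-states; the function name and the examples are hypothetical.

```python
def is_empty(finals, delta):
    """Return True iff no final state is reachable; stop at the first one found."""
    reach, changed = set(), True
    while changed:
        changed = False
        for sp, leaves in delta.items():
            if not all(q in reach for q in sp):
                continue  # some component of the super-state is not reachable yet
            for targets in leaves.values():
                new = targets - reach
                if new & finals:
                    return False  # a reachable final state: the language is nonempty
                if new:
                    reach |= new
                    changed = True
    return True

delta = {
    (): {"a": frozenset({"q0"})},
    ("q0",): {"f": frozenset({"q1"})},
    ("q2",): {"f": frozenset({"q3"})},
}
nonempty_case = is_empty({"q1"}, delta)  # q1 is reachable over f(a)
empty_case = is_empty({"q3"}, delta)     # q3 is reachable only from unreachable q2
```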

5.2.11 Downward Simulation Reduction 

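Downward simulation [26] $\preceq$ for a finite tree automaton 
$\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ is a binary relation on $Q$ such that if 
$q \preceq r$ and $f(q_1, \dots, q_n) \to q \in \Delta$, then there are $r_1, \dots, r_n$ such 
that $f(r_1, \dots, r_n) \to r \in \Delta$ and $q_i \preceq r_i$ for each $1 \le i \le n$. 
Formally: 

$$\forall f \in \mathcal{F} : [q \preceq r \land f(q_1, \dots, q_n) \to q \in \Delta] \implies [\exists r_1, \dots, r_n \in Q : f(r_1, \dots, r_n) \to r \in \Delta \land \forall\, 1 \le i \le n : q_i \preceq r_i] \tag{5.15}$$

From the previous equation, the following can be inferred by contraposition: 

$$\forall f \in \mathcal{F} : \neg[\exists r_1, \dots, r_n \in Q : f(r_1, \dots, r_n) \to r \in \Delta \land \forall\, 1 \le i \le n : q_i \preceq r_i] \implies [\neg(q \preceq r) \lor \neg(f(q_1, \dots, q_n) \to q \in \Delta)] \tag{5.16}$$

We further extend the $\preceq$ relation to super-states: 

$$(q_1, \dots, q_n) \preceq (r_1, \dots, r_n) \iff \forall\, 1 \le i \le n : q_i \preceq r_i \tag{5.17}$$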

It can be proved that $\preceq$ is reflexive and transitive. It is possible to use downward 
simulation for reduction of the size of an automaton by identifying states that simulate each 
other and collapsing those states together. Even though an automaton obtained in this way is 
often not minimum, the reduction can be significant and the computation is faster than 
minimisation, which needs to first convert the automaton to a deterministic one. 

The algorithm for computation of downward simulation, described in Algorithm 9, 
starts with declaring ${\preceq} = Q \times Q$ and then for each super-state 
$(q_1, \dots, q_n)$ finds all super-states $(r_1, \dots, r_n)$ such that 
$(q_1, \dots, q_n) \preceq (r_1, \dots, r_n)$ and makes a new MTBDD by uniting the sink nodes 
of their MTBDDs. This union MTBDD represents all states $r$ that can be reached using 
super-states $(r_1, \dots, r_n)$ simulating the super-state $(q_1, \dots, q_n)$. Now for each 
state $q$ accessible from $(q_1, \dots, q_n)$ over a symbol $f \in \mathcal{F}$ we check for 
each $r$ such that $q \preceq r$ that $r$ is in the union MTBDD accessible over $f$. In case 
it is not, according to Equation 5.16 the simulation relation $\preceq$ can be refined by 
removing $(q, r)$ from $\preceq$. This is repeated until $\preceq$ reaches the fixpoint. 

As downward simulation is reflexive and transitive but generally not symmetric, its 
symmetric part ${\preceq} \cap {\preceq}^{-1}$ (i.e. the pairs of states that simulate each 
other) needs to be taken in order to obtain an equivalence relation. Reduction is then 
performed using the generic reduction procedure described in Section 5.2.7. 
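
The fixpoint computation can be sketched in Python as follows. The sketch works on an explicit list of transition rules derived from the dictionary stand-in for the transition function and checks the condition of Equation 5.15 pair by pair, instead of building the union MTBDD used by Algorithm 9; the function names and the example automaton are assumptions made for illustration.

```python
def downward_simulation(states, delta):
    """Largest relation satisfying the downward simulation condition."""
    rules = [(f, sp, q) for sp, leaves in delta.items()
             for f, targets in leaves.items() for q in targets]
    sim = {(q, r) for q in states for r in states}
    changed = True
    while changed:
        changed = False
        for q, r in list(sim):
            for f, sp, target in rules:
                if target != q:
                    continue
                # Look for a rule f(r_1..r_n) -> r whose children simulate sp.
                matched = any(g == f and t == r and len(rp) == len(sp)
                              and all((a, b) in sim for a, b in zip(sp, rp))
                              for g, rp, t in rules)
                if not matched:
                    sim.discard((q, r))
                    changed = True
                    break
    return sim

def simulation_equivalence(sim):
    """Pairs of states that simulate each other."""
    return {(q, r) for q, r in sim if (r, q) in sim}

# q2 can do everything q1 can (rule a), but q1 lacks the rule b of q2.
states = {"q1", "q2", "q3"}
delta = {
    (): {"a": frozenset({"q1", "q2"}), "b": frozenset({"q2"})},
    ("q1",): {"f": frozenset({"q3"})},
    ("q2",): {"f": frozenset({"q3"})},
}
sim = downward_simulation(states, delta)
equiv = simulation_equivalence(sim)
```

In this example q1 is simulated by q2 but not vice versa, so the resulting equivalence contains only the reflexive pairs and no states would be collapsed.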



Algorithm 9: Downward simulation computation 
Input: Input automaton $\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ 
Output: Simulation relation ${\preceq} \subseteq Q \times Q$ 
begin 
    prevSim := $\emptyset$; 
    sim := $Q \times Q$; 
    while prevSim $\ne$ sim do 
        prevSim := sim; 
        foreach $n \in \mathbb{N}$ such that $S_n(\Delta) \ne \emptyset$ do 
            foreach $(q_1, \dots, q_n) \in S_n(\Delta)$ do 
                /* Create empty MTBDD */ 
                tmp := $\emptyset$; 
                foreach $(r_1, \dots, r_n) \in S_n(\Delta)$ such that $\forall\, 1 \le i \le n : (q_i, r_i) \in$ sim do 
                    tmp := Apply tmp ($\Delta$ $(r_1, \dots, r_n)$) ($\lambda X\, Y$ . $X \cup Y$); 
                end foreach 
                Apply ($\Delta$ $(q_1, \dots, q_n)$) tmp (simulationRefinement sim); 
            end foreach 
        end foreach 
    end while 
    return ${\preceq}$ = sim; 
end 

Function simulationRefinement(sim, lhs, rhs) 
begin 
    foreach $q \in$ lhs do 
        foreach $r$ such that $(q, r) \in$ sim do 
            if $r \notin$ rhs then sim := sim $\setminus \{(q, r)\}$; 
        end foreach 
    end foreach 
end 

5.2.12 Checking Language Inclusion Using Antichains 

The language inclusion decision problem is to determine for two input finite tree automata 
$\mathcal{A}_1 = (Q_1, \mathcal{F}, Q_{f1}, \Delta_1)$ and 
$\mathcal{A}_2 = (Q_2, \mathcal{F}, Q_{f2}, \Delta_2)$ whether it holds that 
$\mathcal{L}(\mathcal{A}_1) \subseteq \mathcal{L}(\mathcal{A}_2)$. 
The standard approach to checking language inclusion is to determinise $\mathcal{A}_2$, 
complement it, and check whether 
$\mathcal{L}(\mathcal{A}_1) \cap \overline{\mathcal{L}(\mathcal{A}_2)} = \emptyset$. In case 
the intersection is not empty, there are some trees which are in $\mathcal{L}(\mathcal{A}_1)$ 
and not in $\mathcal{L}(\mathcal{A}_2)$, and therefore the inclusion does not hold. 
Nevertheless, complementation needs determinisation of $\mathcal{A}_2$, which 
is often very expensive. Therefore it is desirable to find approaches that do not need this 
operation. 

One approach that avoids determinisation is based on antichains [20]. An antichain 
over $Q_1 \times 2^{Q_2}$ is a set $S \subseteq Q_1 \times 2^{Q_2}$ such that for every 
$(p, s), (p', s') \in S$, if $p = p'$ then $s \not\subset s'$. For $(p, s) \in S$, $p$ denotes 
a state of $\mathcal{A}_1$ that is reachable over some tree and $s$ denotes the set of states 
of the automaton $\mathcal{A}_2$ that are reachable over the same tree. If such a pair 
$(p, s)$ can be reached so that $p \in Q_{f1}$ and $\forall r \in s : r \notin Q_{f2}$, the 
inclusion $\mathcal{L}(\mathcal{A}_1) \subseteq \mathcal{L}(\mathcal{A}_2)$ 
does not hold. The algorithm is given in Algorithm 10. 

5.3 Transducers 

This section starts with a description of the representation of relabelling (sometimes 
called structure-preserving) tree transducers. These are transducers that do not change the 
structure of input trees but only change the symbols in their nodes. The section continues 
with a definition of two operations that are necessary in regular tree model checking: 
performing a transduction step on a finite tree automaton and composition of transducers. 

5.3.1 Representation of a Relabelling Tree Transducer 

We represent only relabelling tree transducers that use the same alphabet for both 
input and output; we therefore refer to a transducer 
$\tau = (Q, \mathcal{F}, \mathcal{F}' = \mathcal{F}, Q_f, \Delta)$ as 
$\tau = (Q, \mathcal{F}, Q_f, \Delta)$. Relabelling tree transducers contain transduction 
rules of the following type: 

$$f(q_1(x_1), \dots, q_n(x_n)) \to q(g(x_1, \dots, x_n)) \tag{5.18}$$

where $n \in \mathbb{N}$, $f, g \in \mathcal{F}_n$, $q, q_1, \dots, q_n \in Q$ and 
$x_1, \dots, x_n \in \mathcal{X}$, or using an alternative notation as 

$$f(q_1, \dots, q_n) \to q(g). \tag{5.19}$$

The representation of a transduction function $\Delta$ of a relabelling tree transducer is 
therefore very similar to the representation of a transition function of a finite tree 
automaton and can again be symbolic. We naturally extend the definition of the set of 
super-states $S(\Delta)$ to the transduction function. The transduction function $\Delta$ of a 
relabelling tree transducer $\tau$ may then be alternatively defined as a mapping 
$\Delta^{*}$ in the following way: 

$$\Delta^{*} : S \to (\mathcal{F} \to (\mathcal{F} \to 2^{Q})), \qquad (q_1, \dots, q_p) \mapsto \{(f, (g, D)) \mid D = \{q \mid f(q_1, \dots, q_p) \to q(g) \in \Delta\}\}. \tag{5.20}$$

However, since a function of type $\mathcal{F} \to (\mathcal{F} \to 2^{Q})$ can be uncurried 
to a function of type $(\mathcal{F} \times \mathcal{F}) \to 2^{Q}$, the formula in 
Equation 5.20 can be rewritten as 

$$\Delta^{*} : S \to ((\mathcal{F} \times \mathcal{F}) \to 2^{Q}), \qquad (q_1, \dots, q_p) \mapsto \{((f, g), D) \mid D = \{q \mid f(q_1, \dots, q_p) \to q(g) \in \Delta\}\} \tag{5.21}$$

(we again do not distinguish between $\Delta$ and $\Delta^{*}$). This means that we can 
represent a transduction function of a relabelling tree transducer using MTBDDs in the same 
way as a transition function of a finite tree automaton, provided we extend the function 
$\mathit{enc} : \mathcal{F} \to \{0, 1\}^n$, which is defined in Section 5.1, to 
$\mathit{enc}_\tau$ in the following way: 

$$\mathit{enc}_\tau : (\mathcal{F} \times \mathcal{F}) \to \{0, 1\}^{2n}, \qquad (a, b) \mapsto (a_1, \dots, a_n, b_1, \dots, b_n) \tag{5.22}$$

where $(a_1, \dots, a_n) = \mathit{enc}(a)$ and $(b_1, \dots, b_n) = \mathit{enc}(b)$. 

Note that the actual ordering of $a_1, \dots, a_n$ and $b_1, \dots, b_n$ is not important 
provided that it remains consistent for $\mathit{enc}_\tau$. Another ordering which may be 
useful in some cases is, for instance, $(a_1, b_1, \dots, a_n, b_n)$. 

If we denote the MTBDD for a super-state $s_{\mathcal{A}}$ of the transition function $\Delta_{\mathcal{A}}$ of a 
finite tree automaton $\mathcal{A}$ as 

$$\sum_{\substack{f \in \mathcal{F} \\ \mathrm{enc}(f) = (a_1, \ldots, a_n)}} \left[ \prod_{a_i = 0} \overline{x_i} \cdot \prod_{a_i = 1} x_i \cdot s_{\mathcal{A}}(f) \right] \tag{5.23}$$

(see Section 3.2 for further details of this notation), we may represent the MTBDD for a super-state 
$s_\tau$ of the transduction function $\Delta_\tau$ of a relabelling tree transducer $\tau$ as 

$$\sum_{\substack{(f, g) \in \mathcal{F} \times \mathcal{F} \\ \mathrm{enc}_T(f, g) = (a_1, \ldots, a_n, b_1, \ldots, b_n)}} \left[ \prod_{a_i = 0} \overline{x_i} \cdot \prod_{a_i = 1} x_i \cdot \prod_{b_i = 0} \overline{y_i} \cdot \prod_{b_i = 1} y_i \cdot s_\tau(f, g) \right]. \tag{5.24}$$

Algorithm 4: Antichain-based inclusion 
Input: Input automata A_1 = (Q_1, ℱ, Q_f1, Δ_1) and A_2 = (Q_2, ℱ, Q_f2, Δ_2) 
Output: true if L(A_1) ⊆ L(A_2), false otherwise 
begin 
    prevAntichain := ∅; 
    antichain := ∅; 
    Apply (Δ_1 ()) (Δ_2 ()) (collectProducts antichain); 
    while antichain ≠ prevAntichain do 
        prevAntichain := antichain; 
        foreach (q, D) ∈ prevAntichain do 
            if q ∈ Q_f1 ∧ ∀p ∈ D : p ∉ Q_f2 then return false; 
        end foreach 
        foreach n ∈ N such that S_n(Δ_1) ≠ ∅ do 
            foreach (q_1, ..., q_n) ∈ S_n(Δ_1) such that 
                    ∀1 ≤ i ≤ n : ∃R_i ⊆ Q_2 : (q_i, R_i) ∈ prevAntichain do 
                tmp := ∅; 
                foreach (s_1, ..., s_n) ∈ S_n(Δ_2) such that ∀1 ≤ i ≤ n : s_i ∈ R_i do 
                    tmp := Apply tmp (Δ_2 (s_1, ..., s_n)) (λ X Y . X ∪ Y); 
                end foreach 
                Apply (Δ_1 (q_1, ..., q_n)) tmp (collectProducts antichain); 
            end foreach 
        end foreach 
    end while 
    return true; 
end 

This representation works with an MTBDD extended by Boolean variables $y_1, \ldots, y_n$ 
(we assume that the representation of finite tree automata described in Section 5.1 uses 
variables $x_1, \ldots, x_n$). In order to support operations that work with both finite tree automata 
and relabelling tree transducers, the following two functions that work directly with 
the structure of MTBDDs are necessary: 

TrimVariables This function receives an MTBDD $M$ and $x$, which is a base name of 
Boolean variables $x_1, \ldots, x_n$ such that $x_1, \ldots, x_n$ are in $M$, and returns an MTBDD 
$M_{-x}$ that does not contain $x_1, \ldots, x_n$. Hence, TrimVariables$(M, x) = M_{-x}$. Since 
this operation may cause collisions (e.g. producing the formula $y_1 y_2 A + y_1 y_2 B$ where 
$A \neq B$, $A \neq \bot$, $B \neq \bot$), they need to be properly handled by uniting colliding state 
sets (e.g. producing $y_1 y_2 (A \cup B)$ for the previous example). The following formula 
formally defines the function: 

$$\mathrm{TrimVariables}\left( \sum_{\substack{(f, g) \in \mathcal{F} \times \mathcal{F} \\ \mathrm{enc}_T(f, g) = (a_1, \ldots, a_n, b_1, \ldots, b_n)}} \left[ \prod_{a_i = 0} \overline{x_i} \cdot \prod_{a_i = 1} x_i \cdot \prod_{b_i = 0} \overline{y_i} \cdot \prod_{b_i = 1} y_i \cdot s(f, g) \right],\ x \right) = \sum_{\substack{g \in \mathcal{F} \\ \mathrm{enc}(g) = (b_1, \ldots, b_n)}} \left[ \prod_{b_i = 0} \overline{y_i} \cdot \prod_{b_i = 1} y_i \cdot \bigcup_{f \in \mathcal{F}} s(f, g) \right] \tag{5.25}$$

The implementation of TrimVariables$(M, x)$ can be done in the following way: 

Traverse $M$ from the root to sink nodes and for each node $k$ on the path such 
that $k$ represents some $x_i$ do the following: take both child nodes of $k$, $k_0$ and 
$k_1$, and set $k$ to $k$ := Apply $k_0$ $k_1$ (λ X Y . X ∪ Y). 

Function collectProducts(antichain, lhs, rhs) 
begin 
    foreach q ∈ lhs do 
        if ∄(q, E) ∈ antichain such that rhs ⊆ E then 
            antichain := (antichain \ {(q, F) | F ⊆ rhs}) ∪ {(q, rhs)}; 
        end if 
    end foreach 
end 

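RenameVariables A function that receives an MTBDD $M$ and base names of two Boolean 
variables $x$ and $y$ such that $x_1, \ldots, x_n$ are in $M$. The function renames all occurrences 
of $x_i$ to $y_i$ for each $1 \leq i \leq n$. The function is formally defined by the following 
formula: 

$$\mathrm{RenameVariables}\left( \sum_{\substack{f \in \mathcal{F} \\ \mathrm{enc}(f) = (a_1, \ldots, a_n)}} \left[ \prod_{a_i = 0} \overline{x_i} \cdot \prod_{a_i = 1} x_i \cdot s(f) \right],\ x,\ y \right) = \sum_{\substack{f \in \mathcal{F} \\ \mathrm{enc}(f) = (b_1, \ldots, b_n)}} \left[ \prod_{b_i = 0} \overline{y_i} \cdot \prod_{b_i = 1} y_i \cdot s(f) \right] \tag{5.26}$$

The implementation of RenameVariables$(M, x, y)$ simply traverses $M$ and renames 
all occurrences of $x_i$ to $y_i$ (assuming that $y_1, \ldots, y_n$ are not in $M$). 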
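
To make the collision handling of TrimVariables concrete, the following sketch mimics Equation 5.25 on an explicit table instead of an MTBDD. It is an illustration only: the types `TransducerSuperState` and `AutomatonSuperState` and the function `trimInput` are hypothetical names introduced here, not part of the library, and symbols are represented directly by their bit strings.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical explicit stand-ins for the symbolic structures:
// a transducer super-state maps (enc(f), enc(g)) to a set of states,
// an automaton super-state maps enc(g) to a set of states.
using StateSet = std::set<unsigned>;
using TransducerSuperState =
    std::map<std::pair<std::string, std::string>, StateSet>;
using AutomatonSuperState = std::map<std::string, StateSet>;

// Explicit analogue of TrimVariables(M, x): forget the input symbol f and
// unite the state sets of all entries that collide on the output symbol g.
AutomatonSuperState trimInput(const TransducerSuperState& s) {
    AutomatonSuperState result;
    for (const auto& entry : s) {
        const std::string& output = entry.first.second;
        result[output].insert(entry.second.begin(), entry.second.end());
    }
    return result;
}
```

In this explicit setting, RenameVariables has no counterpart: once the input component is dropped, the remaining key simply plays the role of the renamed variables.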

5.3.2 Performing a Transduction Step 

The operation of performing a transduction step of a finite tree automaton $\mathcal{A} = (Q_{\mathcal{A}}, \mathcal{F}, Q_{f\mathcal{A}}, \Delta_{\mathcal{A}})$ 
according to the transduction denoted by a relabelling tree transducer $\tau = (Q_\tau, \mathcal{F}, Q_{f\tau}, \Delta_\tau)$ 
constructs a finite tree automaton $\mathcal{A}_w = (Q_w, \mathcal{F}, Q_{fw}, \Delta_w)$ such that $\mathcal{L}(\mathcal{A}_w) = \tau(\mathcal{L}(\mathcal{A}))$. 
Informally, if $\mathcal{A}$ represents a set of configurations of a system and $\tau$ represents transitions in 
the system, then $\tau(\mathcal{L}(\mathcal{A}))$ is the set of configurations of the system after one transition, 
and $\mathcal{A}_w$ is a finite tree automaton that represents this set. 

Algorithm 4: Performing transduction step 
Input: Input automaton A = (Q_A, ℱ, Q_fA, Δ_A) 
       Relabelling tree transducer τ = (Q_τ, ℱ, Q_fτ, Δ_τ) 
Output: Automaton A_w = (Q_w, ℱ, Q_fw, Δ_w) such that L(A_w) = τ(L(A)) 
begin 
    Q_w := Q_fw := Δ_w := ∅; 
    newStates := empty queue; 
    tmp := Apply (Δ_A ()) (Δ_τ ()) (intersect newStates); 
    tmp := TrimVariables(tmp, x); 
    Δ_w () := RenameVariables(tmp, y, x); 
    while newStates is not empty do 
        (q_a, q_b) := newStates.dequeue(); 
        if (q_a, q_b) ∉ Q_w then 
            Q_w := Q_w ∪ {(q_a, q_b)}; 
            if q_a = q_sink ∨ q_b = q_sink then continue; 
            if q_a ∈ Q_fA ∧ q_b ∈ Q_fτ then Q_fw := Q_fw ∪ {(q_a, q_b)}; 
            foreach n ∈ N such that S_n(Δ_A) ≠ ∅ ∧ S_n(Δ_τ) ≠ ∅ do 
                foreach (q_11, ..., q_1n) ∈ S_n(Δ_A) such that q_a ∈ (q_11, ..., q_1n) do 
                    foreach (q_21, ..., q_2n) ∈ S_n(Δ_τ) such that q_b ∈ (q_21, ..., q_2n) do 
                        if ∀1 ≤ i ≤ n : (q_1i, q_2i) ∈ Q_w then 
                            sp1 := (q_11, ..., q_1n); 
                            sp2 := (q_21, ..., q_2n); 
                            tmp := Apply (Δ_A sp1) (Δ_τ sp2) (intersect newStates); 
                            tmp := TrimVariables(tmp, x); 
                            Δ_w ((q_11, q_21), ..., (q_1n, q_2n)) := RenameVariables(tmp, y, x); 
                        end if 
                    end foreach 
                end foreach 
            end foreach 
        end if 
    end while 
    return A_w = (Q_w, ℱ, Q_fw, Δ_w); 
end 

The algorithm for this operation is described in Algorithm 4. The algorithm assumes 
that MTBDDs for the transition function $\Delta_{\mathcal{A}}$ are defined over Boolean variables $x_1, \ldots, x_n$ and 
that MTBDDs for the transduction function $\Delta_\tau$ are defined over Boolean variables $x_1, \ldots, x_n$ 
and $y_1, \ldots, y_n$, where $x_1, \ldots, x_n$ are used for input symbols of the transducer and $y_1, \ldots, y_n$ 
are used for output symbols. MTBDDs for the output automaton are again over Boolean 
variables $x_1, \ldots, x_n$. The algorithm works by traversing both the automaton and the transducer 
in parallel and relabelling the transitions that are present in both (the algorithm 
resembles the computation of intersection; it actually uses the function intersect(), which 
is defined in Section 5.2.4). Figure 5.4 illustrates how the algorithm 
works for a pair of super-states. 





[Figure 5.4, panels: a) MTBDD $s_{\mathcal{A}}$; b) MTBDD $s_\tau$; c) res := Apply $s_{\mathcal{A}}$ $s_\tau$ intersect; d) trimmed := TrimVariables(res, x); e) RenameVariables(trimmed, y, x)] 

Figure 5.4: An example of performing a transduction step of transducer $\tau$ on automaton $\mathcal{A}$ 
for one pair of super-states $s_\tau$ and $s_{\mathcal{A}}$, such that $01(s_{\mathcal{A}}) \rightarrow A$, $10(s_{\mathcal{A}}) \rightarrow B$, and $0X(s_\tau) \rightarrow 3(1X)$, 
$10(s_\tau) \rightarrow 7(01)$. Ordering $(a_1, b_1, \ldots, a_n, b_n)$ is used. 
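
The processing of one pair of super-states (intersection on the input symbol, trimming of the input variables, renaming of the output variables) can also be sketched on explicit tables. The code below is a simplified illustration under stated assumptions, not the MTBDD-based implementation: all type and function names are hypothetical, symbols are plain strings, and the queue of newly discovered product states is omitted.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>

using State = unsigned;
using ProductState = std::pair<State, State>;
// Automaton super-state: symbol -> target states.
using AutSuperState = std::map<std::string, std::set<State>>;
// Transducer super-state: (input symbol, output symbol) -> target states.
using TransSuperState =
    std::map<std::pair<std::string, std::string>, std::set<State>>;
// Resulting super-state: output symbol -> product target states.
using ResultSuperState = std::map<std::string, std::set<ProductState>>;

// Explicit analogue of Apply + TrimVariables + RenameVariables for one pair
// of super-states: keep transducer rules whose input symbol is enabled in
// the automaton, and store the product states under the output symbol.
ResultSuperState transduceSuperStates(const AutSuperState& sA,
                                      const TransSuperState& sT) {
    ResultSuperState result;
    for (const auto& rule : sT) {
        const std::string& input = rule.first.first;
        const std::string& output = rule.first.second;
        auto it = sA.find(input);
        if (it == sA.end()) continue;  // no automaton transition over input
        for (State qa : it->second)
            for (State qb : rule.second)
                result[output].insert(ProductState(qa, qb));
    }
    return result;
}
```

Keying the result by the output symbol corresponds to trimming the $x$ variables and renaming $y$ to $x$; two rules with different inputs but the same output end up in the same entry, which is the explicit form of uniting colliding state sets.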



5.3.3 Transducer Composition 

Transducer composition is an operation that, when given two relabelling tree transducers 
$\tau_1 = (Q_1, \mathcal{F}, Q_{f1}, \Delta_1)$ and $\tau_2 = (Q_2, \mathcal{F}, Q_{f2}, \Delta_2)$, creates a relabelling tree transducer $\tau = 
(Q, \mathcal{F}, Q_f, \Delta)$ such that for all finite tree automata $\mathcal{A}$, it holds that $\tau(\mathcal{L}(\mathcal{A})) = \tau_2(\tau_1(\mathcal{L}(\mathcal{A})))$ 
(or $\tau = \tau_2 \circ \tau_1$). 

The algorithm described as Algorithm 5 assumes that MTBDDs for the transduction functions 
$\Delta_1$ and $\Delta_2$ are over Boolean variables $x_1, \ldots, x_n$ (which encode the input symbol) and 
$y_1, \ldots, y_n$ (which encode the output symbol). The MTBDDs for the transduction function 
of the constructed transducer are again over Boolean variables $x_1, \ldots, x_n$ for the input and 
$y_1, \ldots, y_n$ for the output. However, the MTBDDs also need to be able to work with Boolean 
variables $z_1, \ldots, z_n$, as they are used inside the algorithm. 

The algorithm is very similar to the algorithm that performs a transduction step on a 
finite tree automaton (see Section 5.3.2). Figure 5.5 gives an example of the operations 
carried out by the algorithm for one pair of super-states. 

Algorithm 5: Transducer composition 
Input: Input transducers τ_1 = (Q_1, ℱ, Q_f1, Δ_1) and τ_2 = (Q_2, ℱ, Q_f2, Δ_2) 
Output: Transducer τ = (Q, ℱ, Q_f, Δ) such that τ = τ_2 ∘ τ_1 
begin 
    Q := Q_f := Δ := ∅; 
    newStates := empty queue; 
    tmp := RenameVariables(Δ_2 (), y, z); 
    tmp := RenameVariables(tmp, x, y); 
    tmp := Apply (Δ_1 ()) tmp (intersect newStates); 
    tmp := TrimVariables(tmp, y); 
    Δ () := RenameVariables(tmp, z, y); 
    while newStates is not empty do 
        (q_a, q_b) := newStates.dequeue(); 
        if (q_a, q_b) ∉ Q then 
            Q := Q ∪ {(q_a, q_b)}; 
            if q_a = q_sink ∨ q_b = q_sink then continue; 
            if q_a ∈ Q_f1 ∧ q_b ∈ Q_f2 then Q_f := Q_f ∪ {(q_a, q_b)}; 
            foreach n ∈ N such that S_n(Δ_1) ≠ ∅ ∧ S_n(Δ_2) ≠ ∅ do 
                foreach (q_11, ..., q_1n) ∈ S_n(Δ_1) such that q_a ∈ (q_11, ..., q_1n) do 
                    foreach (q_21, ..., q_2n) ∈ S_n(Δ_2) such that q_b ∈ (q_21, ..., q_2n) do 
                        if ∀1 ≤ i ≤ n : (q_1i, q_2i) ∈ Q then 
                            tmp := RenameVariables(Δ_2 (q_21, ..., q_2n), y, z); 
                            tmp := RenameVariables(tmp, x, y); 
                            sp := (q_11, ..., q_1n); 
                            tmp := Apply (Δ_1 sp) tmp (intersect newStates); 
                            tmp := TrimVariables(tmp, y); 
                            Δ ((q_11, q_21), ..., (q_1n, q_2n)) := RenameVariables(tmp, z, y); 
                        end if 
                    end foreach 
                end foreach 
            end foreach 
        end if 
    end while 
    return τ = (Q, ℱ, Q_f, Δ); 
end 

[Figure 5.5, panels: a) MTBDD $s_1$; b) MTBDD $s_2$; c) res := Apply $s_1$ $s_2$ intersect; d) trimmed := TrimVariables(res, y); e) RenameVariables(trimmed, z, y)] 

Figure 5.5: An example of performing transducer composition of transducer $\tau$ with itself: 
$\tau \circ \tau$, for one super-state $s_\tau$, such that $0X(s_\tau) \rightarrow 3(1X)$, $10(s_\tau) \rightarrow 7(01)$. We assume 
that $s_1 = s_\tau$ and $s_2$ = RenameVariables(RenameVariables($s_\tau$, y, z), x, y). Ordering 
$(a_1, b_1, c_1, \ldots, a_n, b_n, c_n)$ is used. 
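
For one pair of super-states, composition amounts to joining the rules of the first transducer with those of the second on the intermediate symbol, which the variable renamings $y \rightarrow z$ and $x \rightarrow y$ arrange symbolically. The sketch below shows the same join on explicit tables; the names `Relabelling`, `ComposedSuperState` and `composeSuperStates` are hypothetical, and the bookkeeping of newly discovered product states is left out.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>

using State = unsigned;
using ProductState = std::pair<State, State>;
// A relabelling is a pair (input symbol, output symbol).
using Relabelling = std::pair<std::string, std::string>;
using TransSuperState = std::map<Relabelling, std::set<State>>;
using ComposedSuperState = std::map<Relabelling, std::set<ProductState>>;

// Explicit analogue of one inner step of transducer composition:
// a rule f -> g of the first transducer and a rule g -> h of the second
// one yield a rule f -> h over product states; the intermediate symbol g
// is forgotten (this is what trimming the y variables achieves).
ComposedSuperState composeSuperStates(const TransSuperState& s1,
                                      const TransSuperState& s2) {
    ComposedSuperState result;
    for (const auto& r1 : s1) {
        for (const auto& r2 : s2) {
            if (r1.first.second != r2.first.first) continue;
            const Relabelling composed(r1.first.first, r2.first.second);
            for (State q1 : r1.second)
                for (State q2 : r2.second)
                    result[composed].insert(ProductState(q1, q2));
        }
    }
    return result;
}
```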
Chapter 6 

Implementation 

This chapter describes the design and implementation of a prototype of the library. It starts 
with a description of the implementation of MTBDDs as defined in Section 5.1. 
This is followed by a description of the object-oriented design of the implementation. 



6.1 MTBDD Package 

Since a smart and efficient implementation of MTBDDs is not trivial, it was decided that 
an existing library should be used instead of implementing our own BDD package. For 
this purpose, CUDD [29] (distributed free of charge under the new and simplified BSD 
licence [30]), which is a C library implementing shared BDDs, ADDs (algebraic decision 
diagrams) and ZDDs (zero-suppressed decision diagrams), has been chosen. 

Using this library, we represent an MTBDD by an ADD [31], which is in fact an MTBDD 
that puts emphasis on performing algebraic operations (such as addition, multiplication, or 
computation of logarithm) on sets of floating point numbers represented by the diagram. 
Despite this broad range of operations, we use the data structure only for storage and 
retrieval of data and for performing the Apply operation. Because CUDD only allows storing 
floating point numbers in the sink nodes of MTBDDs, we had to find a way 
to replace those floating point numbers with sets of states of an automaton (as described 
in Section 5.1). We solved this problem by patching the library so that sink nodes 
contain pointers to sets stored in another data structure, which serves as a pool of sets 
of states. In order to make use of the MTBDD's space reduction, there must 
never be two equal sets of states in the pool. This means that two pointers are equal 
if and only if the sets of states they point to are equal. 
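
The uniqueness invariant of the pool can be sketched as follows. This is a minimal illustration under assumed names (`StateSetPool`, `intern`), not the data structure used in the patched CUDD; it relies only on the fact that `std::set` never relocates its elements, so pointers to them remain valid.

```cpp
#include <cstddef>
#include <set>

using StateSet = std::set<unsigned>;

// Hypothetical pool of state sets in which every set is stored at most once,
// so that pointer equality coincides with set equality.
class StateSetPool {
public:
    // Returns the unique pooled copy of `states`, inserting it if needed.
    // The pointer stays valid for the lifetime of the pool.
    const StateSet* intern(const StateSet& states) {
        return &*pool_.insert(states).first;
    }

    std::size_t size() const { return pool_.size(); }

private:
    std::set<StateSet> pool_;
};
```

With such a pool, two sink nodes can be compared by comparing their pointers, which is what allows the MTBDD package to merge isomorphic subgraphs.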

As the shared variant of MTBDDs is used, a way to distinguish among individual MTBDDs 
in such a structure needs to be defined. We use another data structure that provides a mapping 
from the set of super-states of the transition function to roots of the shared MTBDD. The 
resulting wrapper over MTBDDs provided by CUDD is shown in Figure 6.1. 

Because many algorithms in Chapter 5 need to alter some data outside 
of the MTBDD during an Apply operation, we also patched CUDD to support the following 
Apply operation: Apply(lhs, rhs, op), where lhs and rhs are sets of states of the left 
and right MTBDD respectively, and op is a function object: an object that can be called 
like an ordinary function. By using op to pass pointers to data structures in the main subroutines, 
we can avoid global variables and thus make the code re-entrant and less prone to 
programming errors. 

Figure 6.1: Wrapper over CUDD-provided MTBDDs. 



6.2 Object-Oriented Design 

C++ has been chosen as the implementation language because of its efficiency, good support, 
means for modular design and an extensive standard library. We employ C++'s 
support for the object-oriented programming paradigm to create a generic and modular design, 
which is further described in this section. 

In order to provide both modularity and good performance, policy-based design [32] is 
exploited in the object-oriented design of the library. This approach uses policy classes, 
which are classes that are not supposed to be instantiated (which can be enforced by 
making their constructor protected) or to merely provide an interface, but rather to provide a 
certain functionality when inherited by some class, called the host class. Each policy class 
implements a particular interface called a policy. The host class is a class template, i.e. 
an incomplete class that does not name a type by itself but needs to have its template 
arguments bound in order to do so, as shown in Figure 6.2. In addition to standard 
template arguments, the host class also defines its policies. Using multiple inheritance, 
several orthogonal policy classes may be inherited by the host class. Because the policy 
classes of a host class are resolved statically at compile time, the compiler may perform 
certain optimizations, such as inlining of code, which is an advantage over using e.g. virtual 
methods. 

[Figure 6.2 diagram: binding policy A to the class template "host class" (an incomplete type with a policy parameter) yields "host class<A>" (a complete type).] 

Figure 6.2: Binding of policy classes to host class. 
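
A minimal sketch of this technique is given below. It does not reproduce any class of the library: the policy ("naming of states"), the two policy classes and the host class `StatePrinter` are hypothetical and serve only to show how a policy is bound at compile time.

```cpp
#include <string>

// Two interchangeable policy classes implementing a hypothetical
// "state naming" policy. Their constructors are protected, so they are
// meant to be used only as base classes of a host class.
class ShortNaming {
protected:
    ShortNaming() = default;
    std::string nameOf(unsigned state) const {
        return "q" + std::to_string(state);
    }
};

class VerboseNaming {
protected:
    VerboseNaming() = default;
    std::string nameOf(unsigned state) const {
        return "state_" + std::to_string(state);
    }
};

// Host class: a class template that becomes a complete type only after
// its policy parameter is bound, e.g. StatePrinter<ShortNaming>.
template <class NamingPolicy>
class StatePrinter : private NamingPolicy {
public:
    std::string print(unsigned state) const {
        // Resolved statically; the call can be inlined by the compiler.
        return this->nameOf(state);
    }
};
```

For instance, `StatePrinter<ShortNaming>().print(3)` yields `"q3"`, while binding `VerboseNaming` instead changes the behaviour without any virtual call.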

6.2.1 MTBDD Wrapper 

CUDDFacade is a class that was designed according to the facade design pattern [33]. It 
is used as the access point to the CUDD library, whose API is very extensive and confusing. 
CUDDFacade provides a clean and type-safe interface with only those operations that 
are necessary for the implementation of the library, while hiding the others. The class is 
compiled with all of CUDD's object files into a single static library, which is then used 
by the tree automata library. 



[Figure 6.3 diagram: the static library libcudd_facade.a contains CUDDFacade and CUDD; CUDDSharedMTBDD (+SetValue(), +GetValue(), +Apply(), +MonadicApply(), ...other methods...) accesses CUDD through CUDDFacade.] 

Figure 6.3: Interface to CUDD package. 

CUDDSharedMTBDD is an object-oriented representation of a shared MTBDD as described 
in Section 6.1. The class uses CUDDFacade to access CUDD, as depicted in Figure 6.3. 



CUDDSharedMTBDD is a class template with the following template parameters: 

RootType Defines the type for accessing MTBDDs for individual super-states. This may 
be an arbitrary type; the prototype implementation uses unsigned. 

LeafType The type for the sink node of the MTBDD. This may again be an arbitrary type; 
the prototype implementation uses a set of states. However, deterministic automata 
may store the target state directly in the sink node, thus making the access time 
shorter. 

VariableAssignmentType This template parameter determines the data type for the representation 
of an assignment to the Boolean variables of the MTBDD, i.e. the data type for the 
symbol. To fully utilise the potential of MTBDDs, our representation of an assignment 
to the Boolean variables of an MTBDD can assign 3 different values to a variable: true (1), 
false (0) and don't care (X). By using the don't care value, we may work with 
whole sets of transitions over various symbols as with a single transition (for example, 
to work with four transitions over symbols encoded as 100, 101, 110, and 111 at once, 
it is enough to use the single encoding 1XX). 

RootAllocator The implementation of this policy defines the mapping of roots of RootType 
to CUDD-related pointers to root nodes of the corresponding MTBDDs. 

LeafAllocator This policy determines the exact implementation of the mapping between 
the unsigned sink nodes stored in the patched CUDD data structures and those of 
type LeafType. 

These template parameters make CUDDSharedMTBDD highly configurable, so that it can be used, 
for instance, for deterministic finite tree automata or for transition functions of finite (word) 
automata. 
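
The meaning of the three-valued assignment can be illustrated by a small sketch. The functions `matches` and `expand` are hypothetical helpers written for this illustration; they represent an assignment as a string over the characters 0, 1 and X and are not the library's VariableAssignmentType.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Does the concrete encoding `symbol` (over '0'/'1') match the three-valued
// assignment `pattern` (over '0'/'1'/'X', where 'X' means don't care)?
bool matches(const std::string& pattern, const std::string& symbol) {
    if (pattern.size() != symbol.size()) return false;
    for (std::size_t i = 0; i < pattern.size(); ++i)
        if (pattern[i] != 'X' && pattern[i] != symbol[i]) return false;
    return true;
}

// Enumerates all concrete encodings covered by a three-valued assignment.
std::vector<std::string> expand(const std::string& pattern) {
    std::vector<std::string> result{""};
    for (char c : pattern) {
        std::vector<std::string> next;
        for (const std::string& prefix : result) {
            if (c == 'X') {
                next.push_back(prefix + '0');
                next.push_back(prefix + '1');
            } else {
                next.push_back(prefix + c);
            }
        }
        result.swap(next);
    }
    return result;
}
```

Calling `expand("1XX")` returns the four encodings 100, 101, 110 and 111 from the example above, which a single MTBDD path with two skipped variables represents at once.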

CUDDSharedMTBDD also defines AbstractApplyFunctorType, which is an abstract class 
of a function object with a single pure virtual method, the overloaded operator(): 

virtual LeafType operator()(const LeafType& lhs, const LeafType& rhs) = 0; 

Classes that inherit this abstract function object need to implement this method, thereby 
defining the function for the Apply operation. The Apply operation takes root nodes of two 
MTBDDs and an object of a class that implements the AbstractApplyFunctorType interface: 

RootType Apply(const RootType& lhs, const RootType& rhs, 
               AbstractApplyFunctorType* op); 
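
The following sketch shows how such a function object can carry a pointer to external data instead of relying on global variables. It is a toy model under stated assumptions: `ToyDiagram` replaces the shared MTBDD by an explicit map from symbols to leaves, and `AbstractApplyFunctor`, `CountingUnion` and `applyToy` are hypothetical names that only mirror the shape of the interface described above.

```cpp
#include <map>
#include <set>
#include <string>

using LeafType = std::set<unsigned>;
// Toy stand-in for an MTBDD: symbol -> leaf; absent symbols map to {}.
using ToyDiagram = std::map<std::string, LeafType>;

struct AbstractApplyFunctor {
    virtual ~AbstractApplyFunctor() = default;
    virtual LeafType operator()(const LeafType& lhs, const LeafType& rhs) = 0;
};

// Union of leaves; additionally counts its invocations through a pointer
// to data owned by the caller (no global state is needed).
class CountingUnion : public AbstractApplyFunctor {
public:
    explicit CountingUnion(unsigned* counter) : counter_(counter) {}
    LeafType operator()(const LeafType& lhs, const LeafType& rhs) override {
        ++*counter_;
        LeafType result = lhs;
        result.insert(rhs.begin(), rhs.end());
        return result;
    }
private:
    unsigned* counter_;
};

// Applies `op` leaf-wise to every symbol present in at least one diagram.
ToyDiagram applyToy(const ToyDiagram& lhs, const ToyDiagram& rhs,
                    AbstractApplyFunctor* op) {
    ToyDiagram result;
    const LeafType empty;
    for (const auto& e : lhs) {
        auto it = rhs.find(e.first);
        result[e.first] = (*op)(e.second, it == rhs.end() ? empty : it->second);
    }
    for (const auto& e : rhs)
        if (lhs.find(e.first) == lhs.end())
            result[e.first] = (*op)(empty, e.second);
    return result;
}
```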

6.2.2 Transition Function 

The MTBDDTransitionFunction class represents transition functions of several automata 
using a single shared MTBDD. This is because CUDD only allows executing the Apply operation on 
MTBDD roots from the same shared MTBDD. When an automaton is being created, it 
registers with some MTBDDTransitionFunction and inserts all its transitions into this object. 

A challenging issue is the choice of the data structure for storing 
super-states, i.e. the mapping of super-states to their respective MTBDDs. Storage of nullary 
and unary super-states is straightforward. Since there is only one nullary super-state for each 
automaton, these super-states are stored in a single array indexed by the automaton number. 
Unary super-states of each automaton are stored in a separate array indexed by the number 
of the only state. The prototype implementation also stores binary super-states 
in a 2-dimensional matrix indexed by the two states of the super-state. Because 
the space requirements of an n-dimensional matrix grow exponentially with n, while its 
utilisation drops at almost the same rate for real-world problems, more sophisticated 
data structures are needed for higher arities. For super-states with arity greater than 2, 
our prototype implementation uses a hash table with an arbitrarily long vector of states as the 
key and a pointer to the MTBDD as the value. 
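
The storage scheme described above can be sketched as follows. The sketch is a simplification under stated assumptions: it handles a single automaton (so the nullary super-state is one field rather than an array indexed by the automaton number), it uses an unsigned handle `Root` in place of a pointer to an MTBDD, and the class name `SuperStateStore`, the hash functor and the sentinel `kNoRoot` are hypothetical.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

using State = unsigned;
using Root = unsigned;  // hypothetical handle of an MTBDD root
using SuperState = std::vector<State>;

const Root kNoRoot = ~0u;  // marks "no MTBDD stored"

// Simple (not tuned) hash of a vector of states.
struct SuperStateHash {
    std::size_t operator()(const SuperState& v) const {
        std::size_t h = v.size();
        for (State q : v) h = h * 31 + q;
        return h;
    }
};

class SuperStateStore {
public:
    explicit SuperStateStore(std::size_t stateCount)
        : stateCount_(stateCount), nullary_(kNoRoot),
          unary_(stateCount, kNoRoot),
          binary_(stateCount * stateCount, kNoRoot) {}

    void set(const SuperState& s, Root root) {
        switch (s.size()) {
            case 0: nullary_ = root; break;
            case 1: unary_[s[0]] = root; break;
            case 2: binary_[s[0] * stateCount_ + s[1]] = root; break;
            default: other_[s] = root; break;  // arity > 2: hash table
        }
    }

    Root get(const SuperState& s) const {
        switch (s.size()) {
            case 0: return nullary_;
            case 1: return unary_[s[0]];
            case 2: return binary_[s[0] * stateCount_ + s[1]];
            default: {
                auto it = other_.find(s);
                return it == other_.end() ? kNoRoot : it->second;
            }
        }
    }

private:
    std::size_t stateCount_;
    Root nullary_;
    std::vector<Root> unary_;
    std::vector<Root> binary_;  // row-major stateCount x stateCount matrix
    std::unordered_map<SuperState, Root, SuperStateHash> other_;
};
```

The arrays give constant-time access for the most common arities, while the hash table keeps memory proportional to the number of super-states that actually occur.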

6.2.3 Tree Automaton 

The TreeAutomaton class represents a finite tree automaton with a high-level interface. 
The interface allows the use of human-readable names of states and symbols and provides a 
mapping to their inner representation. TreeAutomaton enables adding a state or a transition and 
marking a state as final. It also supports importing a finite tree automaton from a file 
and exporting it to a file. 

6.2.4 Automaton Import 

In order to provide a direct user interface to the library, the library supports reading a finite 
tree automaton from a file. The reading interface is designed according to the builder 
design pattern [33]. Building a new finite tree automaton from a file is done by creating 
an object of class TABuildingDirector and assigning to it an instance of a class implementing 
the AbstractTABuilder interface. Then, calling the Construct() method of 
TABuildingDirector (which in turn calls the Build() method of AbstractTABuilder) 
with a data stream in a format recognized by the concrete builder constructs the corresponding 
tree automaton. For testing purposes, one concrete builder was implemented: 
TimbukTABuilder, which accepts an input data stream with a description of automata in a 
Timbuk-like format (see Section 3.1). 

[Figure 6.4 diagram: TABuildingDirector (+Construct()) aggregates AbstractTABuilder (+Build()), which is implemented by TimbukTABuilder (+Build()).] 

Figure 6.4: TABuildingDirector structure. 
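
The interplay of the director and the builder can be sketched as follows. The sketch only mirrors the structure in Figure 6.4: `ToyAutomaton`, `AbstractBuilder`, `StateListBuilder` and `BuildingDirector` are hypothetical names, and the input format (whitespace-separated state names) is a deliberately trivial stand-in, not the Timbuk format.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, heavily simplified automaton: just a list of state names.
struct ToyAutomaton {
    std::vector<std::string> states;
};

class AbstractBuilder {
public:
    virtual ~AbstractBuilder() = default;
    virtual ToyAutomaton Build(std::istream& input) = 0;
};

// Concrete builder for a trivial format: whitespace-separated state names.
class StateListBuilder : public AbstractBuilder {
public:
    ToyAutomaton Build(std::istream& input) override {
        ToyAutomaton automaton;
        std::string name;
        while (input >> name) automaton.states.push_back(name);
        return automaton;
    }
};

// The director knows only the abstract interface; the concrete input
// format is chosen by the builder assigned to it.
class BuildingDirector {
public:
    explicit BuildingDirector(AbstractBuilder* builder) : builder_(builder) {}
    ToyAutomaton Construct(std::istream& input) {
        return builder_->Build(input);
    }
private:
    AbstractBuilder* builder_;
};
```

Supporting another input format then only requires a new class derived from the abstract builder; the director and its callers stay unchanged.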



6.2.5 Automaton Export 

This section deals with exporting the description of a tree automaton into a human-readable 
format. The most difficult task is to extract transitions from the symbolic 
representation into an explicit one. This operation needs to know the structure of the MTBDD 
used for the representation of the transition function. This is achieved by using test symbols 
$(x_1, \ldots, x_n)$ that start with value XXX...X and, for each Boolean variable $x_1, \ldots, x_n$, attempting 
to bind its value to both 0 and 1. If the resulting MTBDDs for both bindings 
are the same, then the value of the variable is not important: it is left as X (don't care) 
and the procedure carries on to the following variables. If the MTBDDs are not 
the same, the procedure splits into two branches and continues for both bindings. This 
continues until all variables have been either bound or left as X. 
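
The procedure can be sketched on a toy stand-in for an MTBDD. In the sketch below, a diagram over $n$ variables is replaced by a table of $2^n$ leaf values, and the test "the MTBDDs for both bindings are the same" is done by brute force; with a real MTBDD package this test is a comparison of two cofactors. All names (`Table`, `Cube`, `extractCubes`, ...) are hypothetical, and the table size is assumed to be exactly $2^n$.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Leaf value for each of the 2^n assignments, indexed by the assignment
// read as a binary number (x_1 is the most significant bit).
using Table = std::vector<int>;
using Cube = std::pair<std::string, int>;  // (assignment over 0/1/X, leaf)

// True if assignment `index` agrees with every bound position of `cube`.
bool consistent(const std::string& cube, std::size_t index) {
    const std::size_t n = cube.size();
    for (std::size_t i = 0; i < n; ++i) {
        const char bit = ((index >> (n - 1 - i)) & 1u) ? '1' : '0';
        if (cube[i] != 'X' && cube[i] != bit) return false;
    }
    return true;
}

// True if binding variable `var` to 0 and to 1 gives the same function on
// all assignments consistent with `cube` (where `var` is still X).
bool sameForBothBindings(const Table& t, const std::string& cube,
                         std::size_t var) {
    const std::size_t mask = std::size_t(1) << (cube.size() - 1 - var);
    for (std::size_t index = 0; index < t.size(); ++index)
        if (consistent(cube, index) && t[index & ~mask] != t[index | mask])
            return false;
    return true;
}

void extract(const Table& t, std::string cube, std::size_t var,
             std::vector<Cube>& out) {
    if (var == cube.size()) {
        // Every variable left as X is irrelevant here, so any consistent
        // assignment yields the leaf value of the whole cube.
        for (std::size_t index = 0; index < t.size(); ++index)
            if (consistent(cube, index)) {
                out.push_back(Cube(cube, t[index]));
                return;
            }
        return;
    }
    if (sameForBothBindings(t, cube, var)) {
        extract(t, cube, var + 1, out);  // leave the variable as X
    } else {
        cube[var] = '0';
        extract(t, cube, var + 1, out);
        cube[var] = '1';
        extract(t, cube, var + 1, out);
    }
}

// Starts from XX...X and returns the extracted cubes with their leaves.
std::vector<Cube> extractCubes(const Table& t, std::size_t n) {
    std::vector<Cube> out;
    extract(t, std::string(n, 'X'), 0, out);
    return out;
}
```

For a function of three variables that depends only on $x_1$, the sketch produces two cubes, 0XX and 1XX, instead of eight explicit assignments.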

A simple script that converts a file in the output format into a graphical representation 
of the automaton for the dot tool [34] has been created. 

Unlike for finite word automata, the transition function and a run of a finite tree automaton 
cannot be expressed simply as a labelled graph and a walk (i.e. a sequence of vertices and edges 
such that each vertex or edge may occur several times in the sequence) in the graph. To 
our knowledge, there is no widely accepted standard graphical representation of finite tree 
automata. We therefore attempted to choose a simple and understandable representation 
resembling finite (word) automata as much as possible. As in finite (word) automata, states 
are represented by circles and final states by double circles. We introduce a new type of vertices 
representing super-states, which makes the graph bipartite: an edge leads either from a state to a 
super-state or from a super-state to a state (this may resemble a Petri net). Super-states 
are represented by rectangles with boxes. The number of boxes determines the arity of the 
super-state. The labelling of edges from states to super-states denotes the position of the 
state in the super-state vector; the labelling of edges from super-states to states denotes the 
symbol over which the transition is made. An example of the graphical representation of 
a finite tree automaton is given in Figure 6.5. 

Figure 6.5: Example of graphical representation of a tree automaton. 

6.2.6 Operations 

Operations on finite tree automata with the transition function represented using an MTBDD 
are provided by the MTBDDOperation class. The prototype implementation implements the 
following operations: language union, language intersection, reduction of an automaton 
according to a given equivalence relation, and computation of downward simulation. All these 
operations are performed using only the interface provided by CUDDSharedMTBDD. 

Chapter 7 

Evaluation 

This chapter provides an evaluation of the prototype implementation (called libSFTA) of 
the library described in the previous chapters. The tests were run on a laptop with a dual-core 
Intel Core 2 Duo CPU at 1.80 GHz and 2 GiB of available memory, running Debian Squeeze 
GNU/Linux. We measured the performance of the following three finite tree automata 
operations: language union, language intersection, and automaton reduction according to 
the downward simulation relation. 



7.1 Language Union 

The performance of the language union operation on two input finite tree automata was 
measured and compared to Timbuk, a tree automata library (described in Section 3.1) 
that performs operations on nondeterministic finite tree automata using an explicit 
representation of the transition function (note that the implemented library uses a symbolic 
representation). We chose Timbuk rather than MONA because MONA immediately determinises 
input automata, so the comparison would not be fair. 

We performed the tests on binary tree automata over an alphabet with 130 symbols and 
state sets of various sizes, obtained from tree model checking of real systems. It should 
be pointed out that the execution time for both libSFTA and Timbuk does not include 
the time necessary to load the automaton from a file. This should give more valid results, 
since building an MTBDD for a transition function is not a trivial operation (note that 
the comparison is fair from a practical point of view, since within a verification framework 
automata are built internally, not loaded from a file). The results are given in Table 7.1 
(and in graphical form in Figure 7.1). It can be seen that libSFTA outperforms 
Timbuk in all cases, by three to five orders of magnitude. 



Automata          Timbuk       libSFTA 
----------------------------------------
A0053, A0054      1.982 s      0.0005 s 
A0080, A0082      37.645 s     0.0007 s 
A0080, A0111      37.645 s     0.0008 s 
A0053, A0246      414.104 s    0.0010 s 
A0080, A0246      533.678 s    0.0012 s 
A0082, A0246      542.069 s    0.0012 s 

Table 7.1: Language union performance results. 

7.2 Language Intersection 

The language intersection operation on two input finite tree automata 
was also compared to Timbuk. These experiments were conducted with the same set of test 
automata and under the same testing conditions as for the language union operation (see Section 7.1). 
The results are given in Table 7.2, with the graph in Figure 7.2. The results show that 
for smaller state sets the two libraries perform comparably, whereas for larger state sets 
Timbuk computes the finite tree automaton for language intersection 
faster than libSFTA (by a factor of up to about 3). 



Automata          Timbuk       libSFTA 
----------------------------------------
A0053, A0054      0.076 s      0.057 s 
A0053, A0246      0.609 s      0.617 s 
A0080, A0082      1.862 s      1.675 s 
A0080, A0111      2.483 s      3.765 s 
A0080, A0246      6.062 s      18.320 s 
A0082, A0246      7.503 s      19.355 s 

Table 7.2: Language intersection performance results. 

7.3 Simulation Reduction 

The performance of reduction of a finite tree automaton according to the downward simulation 
relation was measured in the next series of tests. The results were compared to SA [35], 
an OCaml tool implementing computation of downward simulation over labelled transition 
systems and finite tree automata. 

The execution time for libSFTA comprises computing the downward simulation over the 
input tree automaton, computing the symmetric closure of the relation, and reducing the 
automaton according to the equivalence given by the simulation and its symmetric closure. As 
SA cannot perform reduction of the automaton, the execution time of SA consists of the 
time it takes to load the automaton from a file (which should be negligible according to the 
authors) and to compute the downward simulation. Two different test cases were measured. 

1. The first test case shows how the performance depends on the size of the state set of 
the input finite tree automaton for a fixed small alphabet. The results are given in 
Table 7.3 and Figure 7.3. We can see that the performance of libSFTA is considerably worse 
than that of SA (by a factor of several hundred to several thousand). The reason for this is 
that SA uses a more sophisticated algorithm 
for computation of simulation. In a future version of the library, we wish to focus 
on optimising the algorithm we use in order to be able to compete with the solution 
used in SA even for smaller alphabets. 



Automaton    States    Transitions    SA        libSFTA 
---------------------------------------------------------
A0053        53        159            0.04 s    24.6 s 
A0054        54        241            0.04 s    29.3 s 
A0063        63        571            0.10 s    55.2 s 
A0070        70        622            0.07 s    71.5 s 
A0080        80        672            0.11 s    274.4 s 
A0082        82        713            0.09 s    331.5 s 
A0089        89        1006           0.11 s    226.1 s 

Table 7.3: Experimental results of simulation reduction for various state set size. 

Figure 7.3: Performance comparison of simulation reduction for various state set size. 

2. The second test case shows how the performance of libSFTA and SA relates to the size 
of the alphabet of the input finite tree automaton for a fixed set of states. We created 
a set of simple tree automata that perform transitions over symbols from alphabets of 
various sizes. The number of transitions equals the number of symbols in the alphabet 
(we use one transition for every symbol). The results of the experiments with the size 
of the alphabet are in Table 7.4 and Figure 7.4. It is clear that the use of symbolic 
representation makes the performance of libSFTA far superior to the performance of 
SA: the running time of libSFTA stays below 0.01 s for all tested alphabet sizes, whereas 
that of SA grows from 0.06 s to 48.40 s. 



Symbols    SA         libSFTA 
-------------------------------
1337       0.06 s     0.0033 s 
3525       0.14 s     0.0051 s 
7067       0.26 s     0.0071 s 
15136      0.69 s     0.0054 s 
31235      2.09 s     0.0031 s 
65503      8.86 s     0.0040 s 
130023     48.40 s    0.0045 s 

Table 7.4: Experimental results of simulation reduction for various alphabet size. 

Figure 7.4: Performance comparison of simulation reduction for various alphabet size. 



7.4 Discussion 

The experiments described in this chapter showed that libSFTA has good potential to 
become an interesting tree automata library, especially for applications that need large 
alphabets and can exploit symbolic representation well. The use of nondeterminism also 
considerably accelerates the computation of the automaton for language union and can keep 
automata small and clean. 

Chapter 8 

Conclusion 

The aim of this Master's Thesis was to design and implement an efficient and flexible finite 
tree automata library for use in symbolic formal verification, in particular one applicable 
to tree model checking techniques such as regular tree model checking and abstract regular 
tree model checking. 

The theoretical background of tree automata has been studied, as well as formal verification 
methods for systems represented by tree automata. Existing packages that implement 
tree automata have been surveyed and their advantages and disadvantages summarised. 
An analysis of the aforementioned verification methods has yielded a list of necessary 
requirements for the library. 

A representation of nondeterministic finite tree automata with symbolically represented 
transition functions has been proposed. The representation is based on MTBDDs provided 
by an external package (which may, however, be replaced by writing a simple wrapper with 
the given interface for another library). Algorithms that carry out standard as well as some 
verification-specific operations on tree automata using this representation have also been 
developed and described in this text. 

An object-oriented modular design of the library based on design patterns and policies 
has been created. A prototype implementation has been programmed, evaluated on test- 
ing data, and compared to other tools that provide the same functionality. The results of 
experiments show that the concepts we employed are viable and that the library can com- 
plement currently available tree automata libraries, especially when used for applications 
with finite tree automata that make use of large alphabets and nondeterminism. 

Future work will focus on redesigning the library according to our experience from the 
implementation of the prototype and data collected from code profiling. Further, we wish 
to create a library that supports both explicitly and symbolically represented finite tree 
automata. We also plan to optimise the algorithms that are used by the library in order to 
give good performance even for small alphabets and large state sets. Another direction of 
work includes implementing support for further algorithms for simulation reduction 
(upward and combined simulation based relations) and antichain-based inclusion checking 
(combination of antichains and simulations), including further research on still more advanced 
reduction and inclusion checking techniques. The next step should then be to test 
the library as a part of various verification tools (for instance as a base of abstract regular 
tree model checking tools or with decision procedures for various logics, such as WS2S or 
separation logic). 

Appendix A 

Storage Medium 

A storage medium (DVD) containing an electronic version of the technical report and the source 
code of the prototype implementation, including the patched CUDD package, is enclosed with this 
thesis. 